Introduction to K-Nearest-Neighbour Algorithm
The k-nearest neighbors (KNN) algorithm is a simple supervised machine learning algorithm that can be used to solve both classification and regression problems.
Machine learning algorithms can be broadly classified into two types: supervised algorithms and unsupervised algorithms.
Classification algorithms are used to assign a class label to data instances, while regression algorithms are used to estimate or predict real values for new data instances.
K Nearest Neighbor
The KNN algorithm assumes that similar things are near each other. Essentially, the KNN algorithm finds the k nearest neighbors of a given data instance, takes a majority vote over their class labels, and assigns the winning label to that instance. (For regression, the prediction is typically the average of the neighbors' values.)
The k nearest neighbors are found by calculating the distance between the query (test) point and every point in the training set. The distance can be calculated in many different ways, but the most common choice is the Euclidean distance.
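As a concrete illustration, here is a minimal sketch of this procedure in Python (the function and variable names are my own, not taken from any particular library): it computes the Euclidean distance from a query point to every training point, keeps the k closest, and returns the majority class label.

from collections import Counter
import math

def euclidean_distance(a, b):
    # Straight-line distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    # Distance from the query point to every training point
    distances = [(euclidean_distance(p, query), label)
                 for p, label in zip(train_points, train_labels)]
    # Keep the k closest neighbors
    neighbors = sorted(distances, key=lambda d: d[0])[:k]
    # Majority vote among the neighbors' class labels
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy example with two classes, "A" and "B"
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
labels = ["A", "A", "B", "B"]
print(knn_predict(points, labels, query=(1.1, 0.9), k=3))  # -> "A"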
Other ways of measuring distance (or similarity) include the Hamming distance, Minkowski distance, Kullback-Leibler (KL) divergence, and text-similarity functions such as BM25.
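To make the difference between some of these measures concrete, the sketch below (plain Python, names chosen for illustration) implements two of them: the Minkowski distance, which reduces to the Manhattan distance at p = 1 and the Euclidean distance at p = 2, and the Hamming distance, which is suited to discrete-valued features.

def minkowski_distance(a, b, p):
    # Generalised distance: p = 1 gives Manhattan, p = 2 gives Euclidean
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def hamming_distance(a, b):
    # Number of positions at which two equal-length sequences differ
    return sum(x != y for x, y in zip(a, b))

u, v = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(minkowski_distance(u, v, p=2))           # Euclidean: 5.0
print(minkowski_distance(u, v, p=1))           # Manhattan: 7.0
print(hamming_distance("karolin", "kathrin"))  # 3 differing positions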