K-Nearest Neighbors (K-NN)
K-NN is a supervised algorithm used for classification. This means we have some labelled data upfront which we provide to the model. The algorithm learns the patterns within that data during training, and then uses those learnings to make inferences on unseen data, i.e. the test set. In the case of classification, the labels are discrete in nature.
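As a quick illustration, here is a minimal K-NN sketch using scikit-learn's KNeighborsClassifier. The iris dataset, the 70/30 split, and k=3 are just assumptions picked for demonstration, not recommendations.

```python
# Minimal K-NN sketch (toy dataset and n_neighbors=3 are illustrative choices).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # labelled data we have upfront
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)    # k = 3 nearest neighbours
knn.fit(X_train, y_train)                    # learn from the labelled data
print("Test accuracy:", knn.score(X_test, y_test))  # infer on unseen data
```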
K-Means
K-Means is an unsupervised algorithm used for clustering. This means we don't have labelled data upfront to train the model, so the algorithm relies only on the structure of the independent features to make inferences on unseen data. The steps of K-Means clustering are as follows (see the sketch after the explanation below):
- Randomly pick k centroids/cluster centers. Try to place them near the data but far apart from one another.
- Then assign each data point to the closest centroid.
- Move each centroid to the average location of the data points assigned to it.
- Repeat the preceding two steps until the assignments don’t change, or change very little.
In standard K-Means each point gets assigned to one and only one centroid; points assigned to the same centroid belong to the same cluster.
Each centroid is the average of all the points belonging to its cluster, so centroids can be treated as data points in the same space as the dataset we are using.
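The following is a rough sketch of those steps in NumPy. The randomly generated 2-D data, k=3, and the iteration cap are made-up values for illustration; a production implementation would also handle edge cases such as empty clusters.

```python
# Sketch of the K-Means steps above (illustrative data and k, not a full implementation).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))   # unlabelled data: only independent features
k = 3

# Step 1: pick k centroids from the data itself so they start near the points.
centroids = X[rng.choice(len(X), size=k, replace=False)]

for _ in range(100):
    # Step 2: assign each data point to its closest centroid.
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)

    # Step 3: move each centroid to the average of the points assigned to it.
    new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])

    # Step 4: stop once the centroids (and hence assignments) stop changing.
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print("Final centroids:\n", centroids)
```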