16 – K-Means Clustering Visualization 2

One of the things that’s immediately apparent once I start assigning my centroids, with these colored regions, is how every point is going to be associated with one of the centroids, that is, with one of the clusters. You can see that the blue is probably already in reasonably good shape. I would say that we got a little bit lucky in where the initial centroid was placed: it looks like it’s pretty close to the center of this blob of data. The red and the green, though, look like they’re sitting right on top of each other in the same cluster. So let’s watch as k-means starts to sort out this situation and get all the clusters properly allocated.
So, I hit Go. The first thing it does is tell me explicitly which cluster each one of these points will fall into. You can see we have a few blue points that fall into the wrong cluster over here, and then, of course, the red and the green. This is the association step: every point is associated with the nearest centroid.
The next thing I’ll do is update the centroids. This moves each centroid to the mean of all of its associated points. In particular, I expect this green centroid to be pulled over to the right by the fact that we have so many points over here. So let’s update. Now this is starting to look much better. If we were to leave everything as is, you can see how the clustering was before: all these points that used to be green are now about to become red, and likewise with a few blue points over here. You can see how, even in just one step from this bad initial condition, we’ve already started to capture the structure in the data pretty well.
So I’m going to reassign the points, iterating through this again to assign each point to the nearest centroid. And now things are starting to look very, very consistent.
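
In symbols, the two steps described above are commonly written as follows. The notation is introduced here for illustration and is not from the video: x_i is a data point, mu_k is the k-th centroid, c_i is the cluster index assigned to point i, and S_k is the set of points currently assigned to cluster k.

```latex
% Association step: each point takes the index of its nearest centroid.
c_i \leftarrow \arg\min_{k} \lVert x_i - \mu_k \rVert^2

% Update step: each centroid moves to the mean of its associated points.
\mu_k \leftarrow \frac{1}{|S_k|} \sum_{i \in S_k} x_i,
\qquad S_k = \{\, i : c_i = k \,\}
```
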
There’s probably just one or two more iterations before we have the centroids right at the middle of the clusters, so I update and reassign points. No points have changed, so this is the final clustering that would be assigned by k-means clustering. So in three or four steps, using this algorithm, I assigned every point to a cluster, and it worked in a really beautiful way for this example.
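
The loop walked through above (associate, update, repeat until no point changes cluster) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code behind the visualization: the function names, the toy data, and the choice to leave an empty cluster's centroid where it is are all assumptions made for the example. The starting centroids imitate the situation in the video, with two of them sitting on the same blob.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_points(points, centroids):
    """Association step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Update step: move each centroid to the mean of its associated points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            # Assumption for this sketch: an empty cluster keeps its centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(p[d] for p in members) / len(members)
                  for d in range(len(old)))
        )
    return new_centroids


def k_means(points, centroids, max_passes=100):
    """Alternate the two steps until an assignment pass changes nothing.

    Returns (centroids, labels, number of assignment passes performed).
    """
    labels = None
    for passes in range(1, max_passes + 1):
        new_labels = assign_points(points, centroids)
        if new_labels == labels:  # no point changed cluster: done
            return centroids, labels, passes
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return centroids, labels, max_passes


# Hypothetical toy data: three small blobs along the x-axis.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),          # left blob
    (10, 0), (10, 1), (11, 0), (11, 1),      # middle blob
    (20, 0), (20, 1), (21, 0), (21, 1),      # right blob
]
# Two of the three starting centroids sit on the middle blob.
initial_centroids = [(0.5, 0.5), (10.0, 0.5), (11.0, 0.5)]

final_centroids, labels, passes = k_means(points, initial_centroids)
print(final_centroids, labels, passes)
```

On this toy data the third centroid is dragged toward the right-hand blob after the first update, in the same way the green centroid is pulled across in the video, and the loop stops once an assignment pass leaves every label unchanged.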
