K-means clustering is an algorithm for grouping things based on their common attributes or features.
The K in K-means stands for the number of groups the things are divided into.
For example, suppose a teacher tells 20 students to line up by height in ascending order. Forming a line from shortest to tallest would be fairly easy. The problem arises if the teacher asks the 20 students to line up by both height and weight in ascending order, because two attributes do not give a single ordering. This is where K-means clustering is useful: instead of ordering the students, it groups together those with similar height and weight.
In business, K-means clustering is a useful data mining method because of the decisions a manager can make once he or she knows how the company’s data is grouped. Here’s how K-means clustering can help in decision making.
Average Customer Satisfaction
Budget For Employee Training
Average Food Preparation Time
Loans
The next step is to choose how many groups there should be, that is, what K should be.
In this case, K is equal to 2, which means the attributes should be grouped into 2 according to...
Having figured out how many groupings there would be, the next step is choosing the centroids for each group.
A centroid is the geometric center of a group.
In the first iteration, choose the centroids at random. In this case, the centroids are the data from months 7 and 8.
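A random choice of starting centroids could be sketched as follows. This is only an illustration: the function name `pick_initial_centroids`, the `seed` parameter, and the values in `months` are assumptions made up for the example and are not the food company's data.

```python
import random

def pick_initial_centroids(rows, k, seed=None):
    """Pick k distinct rows at random to serve as the starting centroids."""
    rng = random.Random(seed)                  # seed only makes the choice repeatable
    indices = rng.sample(range(len(rows)), k)  # k distinct row positions
    return [list(rows[i]) for i in indices]

# Made-up placeholder rows, one per month, four attributes each.
months = [
    [3.0, 10.0, 12.0, 5.0],
    [4.0, 12.0, 11.0, 6.0],
    [8.0, 30.0, 6.0, 20.0],
    [9.0, 28.0, 5.0, 22.0],
]

centroids = pick_initial_centroids(months, k=2, seed=0)
```

Each starting centroid is simply a copy of one of the existing rows, which mirrors using the data of two months as the first centroids.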
The next step is to compute the distances of each month's attributes from the centroids using the Euclidean distance formula:
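For reference, the standard Euclidean distance can be written as below. The symbols are assumptions chosen for this note: x is one month's attribute values, c is a centroid, and n is the number of attributes (4 for the attributes listed in this presentation).

```latex
d(x, c) = \sqrt{\sum_{i=1}^{n} \left( x_i - c_i \right)^2}
```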
You will get something like this.
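A distance table of this kind could be computed with a short sketch like the one below. It relies on `math.dist` from the Python standard library (Python 3.8 or later); the function name `distances_to_centroids` and the two-attribute sample points are made up for illustration.

```python
import math

def distances_to_centroids(rows, centroids):
    """Return one list per row holding its Euclidean distance to each centroid."""
    return [[math.dist(row, c) for c in centroids] for row in rows]

# Made-up points with two attributes each, chosen so the arithmetic is easy to follow.
rows = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
centroids = [[0.0, 0.0], [6.0, 8.0]]

table = distances_to_centroids(rows, centroids)
for row, dists in zip(rows, table):
    print(row, dists)
```

Each line of output pairs a point with its distance to Centroid 1 and Centroid 2, in that order.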
After getting the distances of each month's attributes from Centroid 1 and Centroid 2, the next step is to assign each month to one of the two groups.
The months whose attributes are nearer to Centroid 1 are grouped together, and the same goes for the months whose attributes are nearer to Centroid 2.
A 1 indicates that the attributes of that month belong to that centroid.
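The assignment step and the 1/0 membership table can be sketched as follows. The function names, the tie-breaking rule (a tie goes to the lower-numbered centroid), and the sample points are assumptions for illustration only.

```python
import math

def assign_to_nearest(rows, centroids):
    """Return, for each row, the index of the centroid it is nearest to."""
    labels = []
    for row in rows:
        dists = [math.dist(row, c) for c in centroids]
        # Assumption: on a tie, index() picks the lower-numbered centroid.
        labels.append(dists.index(min(dists)))
    return labels

def membership_table(labels, k):
    """Build a table with a 1 in the column of the centroid each row belongs to."""
    return [[1 if label == j else 0 for j in range(k)] for label in labels]

# Made-up points with two attributes each.
rows = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
centroids = [[1.0, 1.0], [9.0, 8.0]]

labels = assign_to_nearest(rows, centroids)  # [0, 0, 1, 1]
table = membership_table(labels, k=2)        # [[1, 0], [1, 0], [0, 1], [0, 1]]
```

Index 0 corresponds to Centroid 1 and index 1 to Centroid 2.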
The last step is finding the coordinates of the new set of centroids. The average of the coordinates in each group becomes the new centroid of that group, as shown in this table...
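Averaging the coordinates of each group could look like the sketch below. The name `update_centroids` and the sample points are made up, and raising an error for an empty group is an assumption of this sketch, since the presentation does not say what to do in that case.

```python
def update_centroids(rows, labels, k):
    """Return the mean point of each group as the new list of centroids."""
    new_centroids = []
    for j in range(k):
        members = [row for row, label in zip(rows, labels) if label == j]
        if not members:
            # Assumption: an empty group is treated as an error in this sketch.
            raise ValueError(f"group {j} has no members")
        dims = len(members[0])
        new_centroids.append(
            [sum(m[d] for m in members) / len(members) for d in range(dims)]
        )
    return new_centroids

# Made-up points with two attributes each, already split into two groups.
rows = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
labels = [0, 0, 1, 1]

new_centroids = update_centroids(rows, labels, k=2)  # [[1.5, 1.0], [8.5, 8.5]]
```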
The new grouping would look something like this...
The process then repeats, following the same steps:
1. Get the distances of each month's attributes from the centroids using the Euclidean distance formula
2. Determine whether each month's attributes are nearer to Centroid 1 or Centroid 2
3. See if the groupings have changed
4. If no, then this is already the final grouping of the attributes with a K of 2
5. If yes, compute the new centroids and repeat the steps until the groupings no longer change
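The steps above can be combined into one loop, sketched below under a few assumptions: the function name `k_means`, the `max_iterations` safety limit, and the sample points are made up for illustration, and a group that ends up empty simply keeps its previous centroid, a case the presentation does not cover.

```python
import math

def nearest(row, centroids):
    """Index of the centroid closest to row (ties go to the lower index)."""
    dists = [math.dist(row, c) for c in centroids]
    return dists.index(min(dists))

def k_means(rows, centroids, max_iterations=100):
    """Repeat assignment and centroid update until the groupings stop changing."""
    previous = None
    for iteration in range(1, max_iterations + 1):
        # Steps 1 and 2: measure distances and assign each row to its nearest centroid.
        labels = [nearest(row, centroids) for row in rows]
        # Steps 3 and 4: stop as soon as the grouping matches the previous pass.
        if labels == previous:
            return labels, centroids, iteration
        # Step 5: recompute each centroid as the average of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                dims = len(old)
                updated.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dims)]
                )
            else:
                updated.append(old)  # assumption: an empty group keeps its centroid
        centroids = updated
        previous = labels
    return labels, centroids, max_iterations

# Made-up points; the first two rows act as the starting centroids.
rows = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
labels, centroids, iterations = k_means(rows, [rows[0], rows[1]])
# labels == [0, 0, 1, 1], reached on the 3rd pass
```

The returned iteration count includes the final pass that only confirms nothing changed, which matches the way the presentation counts its iterations.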
This is what the whole process looks like:
1st Iteration
2nd Iteration
3rd Iteration
4th Iteration
Since there was no change in the groupings after the 4th iteration, the process has converged, and it can be concluded that these are the final clusters for a K of 2.
An annual data set of a food company that has the following attributes