Iterations:
Number of iterations the K-means algorithm took to converge. Needing fewer iterations with K-means++ initialization than with random initialization suggests better starting centroids.
Silhouette:
The silhouette score ranges from -1 to 1 and measures how well-defined the clusters are. As a rule of thumb, scores above 0.5 indicate good clustering, 0.3-0.5 is acceptable, and scores below 0.3 suggest overlapping clusters.
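To make the score concrete, here is a minimal sketch of the standard mean-silhouette computation. It is not taken from this page's source; the names `euclidean` and `silhouetteScore` are hypothetical, and returning 0 for degenerate cases is an assumed convention.

```javascript
// Euclidean distance between two points of equal dimension.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Mean silhouette over all points; labels[i] is the cluster index of points[i].
// For each point: a = mean distance to its own cluster (excluding itself),
// b = lowest mean distance to any other cluster, s = (b - a) / max(a, b).
function silhouetteScore(points, labels) {
  const clusters = [...new Set(labels)];
  if (clusters.length < 2) return 0; // assumed convention: undefined for one cluster

  let total = 0;
  for (let i = 0; i < points.length; i++) {
    const sums = new Map();   // cluster -> summed distance from point i
    const counts = new Map(); // cluster -> number of other points seen
    for (let j = 0; j < points.length; j++) {
      if (i === j) continue;
      const c = labels[j];
      sums.set(c, (sums.get(c) || 0) + euclidean(points[i], points[j]));
      counts.set(c, (counts.get(c) || 0) + 1);
    }

    const own = labels[i];
    if (!counts.has(own)) continue; // singleton cluster contributes s = 0

    const a = sums.get(own) / counts.get(own);
    let b = Infinity;
    for (const c of clusters) {
      if (c !== own && counts.has(c)) {
        b = Math.min(b, sums.get(c) / counts.get(c));
      }
    }
    const denom = Math.max(a, b);
    total += denom === 0 ? 0 : (b - a) / denom;
  }
  return total / points.length;
}
```

This version compares every pair of points, so it costs O(n²) distance evaluations; that is usually acceptable for a browser demo with a modest number of customers.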
Algorithm:
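K-means++ uses smart initialization: the first centroid is picked at random, and each subsequent centroid is picked with probability proportional to its squared distance from the nearest centroid already chosen, so the starting centroids tend to be far apart. This usually leads to faster convergence and better clusters. Random initialization picks all centroids uniformly at random, which can lead to suboptimal results.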
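The seeding step can be sketched as follows. This is an illustrative version of the standard K-means++ procedure, not the page's actual code; `squaredDistance`, `initKMeansPlusPlus`, and the injectable `rng` parameter are hypothetical names introduced here.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// K-means++ seeding. `rng` must return numbers in [0, 1); it defaults to
// Math.random and is injectable (an assumption of this sketch) so runs can
// be reproduced.
function initKMeansPlusPlus(points, k, rng = Math.random) {
  // First centroid: uniform random pick.
  const centroids = [points[Math.floor(rng() * points.length)].slice()];
  // nearest[i] = squared distance from points[i] to its closest chosen centroid.
  const nearest = points.map((p) => squaredDistance(p, centroids[0]));

  while (centroids.length < k) {
    const total = nearest.reduce((s, d) => s + d, 0);
    let chosen = points.length - 1; // fallback in case of floating-point round-off
    if (total === 0) {
      // Every point coincides with a chosen centroid: fall back to a uniform pick.
      chosen = Math.floor(rng() * points.length);
    } else {
      // Weighted pick: walk the cumulative weights until the threshold is crossed.
      let threshold = rng() * total;
      for (let i = 0; i < points.length; i++) {
        threshold -= nearest[i];
        if (threshold < 0) {
          chosen = i;
          break;
        }
      }
    }
    const c = points[chosen].slice();
    centroids.push(c);
    // Update each point's distance to its nearest centroid.
    for (let i = 0; i < points.length; i++) {
      nearest[i] = Math.min(nearest[i], squaredDistance(points[i], c));
    }
  }
  return centroids;
}
```

Random initialization, by contrast, would amount to repeating the uniform pick used for the first centroid `k` times.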
Cluster Visualization
Each point is a customer colored by cluster. Cross marks show cluster centroids. Watch points migrate between clusters as the algorithm iterates.
Convergence History
Inertia is the total within-cluster sum of squared distances from each point to its assigned centroid. As K-means iterates, inertia never increases, and it stops changing at convergence. A steeper drop indicates faster optimization.
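As a small illustration, inertia can be computed from the points, their cluster labels, and the centroids. The function name `inertia` and the array-of-arrays data layout are assumptions of this sketch, not details confirmed by the page.

```javascript
// Within-cluster sum of squared distances.
// points: array of numeric arrays; labels[i]: index into centroids for points[i].
function inertia(points, labels, centroids) {
  let total = 0;
  for (let i = 0; i < points.length; i++) {
    const c = centroids[labels[i]];
    for (let d = 0; d < c.length; d++) {
      const diff = points[i][d] - c[d];
      total += diff * diff;
    }
  }
  return total;
}
```

Recording this value after every iteration would give the kind of series a convergence chart plots.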
Elbow Method
The elbow method helps determine optimal K. The "elbow" point where the curve bends indicates diminishing returns from adding more clusters. Silhouette scores measure cluster quality: higher is better.
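The elbow is usually read off the chart by eye. One simple heuristic for locating it programmatically, assumed here for illustration and not necessarily what this page does, is to pick the K where the inertia curve bends most sharply, measured by the discrete second difference. The name `findElbow` is hypothetical.

```javascript
// ks: ascending, evenly spaced K values; inertias[i]: inertia measured at ks[i].
// Returns the K with the largest second difference, or null if there are
// fewer than three points (no interior point to test).
function findElbow(ks, inertias) {
  if (ks.length < 3) return null;
  let bestK = null;
  let bestBend = -Infinity;
  for (let i = 1; i < ks.length - 1; i++) {
    // A large positive value means a steep drop before ks[i] and a shallow one after.
    const bend = inertias[i - 1] - 2 * inertias[i] + inertias[i + 1];
    if (bend > bestBend) {
      bestBend = bend;
      bestK = ks[i];
    }
  }
  return bestK;
}
```

Because the first and last K have no neighbor on one side, this heuristic can only return an interior value, and it is best treated as a hint to check against the silhouette scores rather than a definitive answer.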
RFM Cluster Statistics
RFM (Recency, Frequency, Monetary) analysis segments customers based on purchase recency, purchase frequency, and total spending. Each cluster represents a distinct customer segment.
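For readers unfamiliar with RFM, the sketch below shows one common way to derive the three features from raw order records. The order shape (`customerId`, `date`, `amount`), the function name `computeRFM`, and measuring recency in whole days are all assumptions made for illustration; the page does not say how its customer data is prepared.

```javascript
// orders: array of { customerId, date: Date, amount: number } (assumed shape).
// asOf: the reference Date from which recency is measured.
function computeRFM(orders, asOf) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const byCustomer = new Map();

  for (const o of orders) {
    const entry = byCustomer.get(o.customerId) ||
      { last: o.date, frequency: 0, monetary: 0 };
    if (o.date > entry.last) entry.last = o.date; // keep the most recent purchase
    entry.frequency += 1;                         // number of purchases
    entry.monetary += o.amount;                   // total spending
    byCustomer.set(o.customerId, entry);
  }

  return [...byCustomer].map(([customerId, e]) => ({
    customerId,
    recency: Math.floor((asOf - e.last) / MS_PER_DAY), // days since last purchase
    frequency: e.frequency,
    monetary: e.monetary,
  }));
}
```

The three features typically sit on very different scales, so some form of normalization before clustering is common; whether this page normalizes is not stated.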
README.md
Interactive K-means++ clustering visualization with RFM (Recency, Frequency, Monetary) analysis. Watch the algorithm converge step-by-step with animated scatter plots, then evaluate cluster quality with silhouette scores and the elbow method.
Features
- Adjustable K -- slider to select 2–8 clusters
- Feature space selection -- Monetary × Frequency, Recency × Monetary, or Recency × Frequency
- K-means++ vs Random -- toggle initialization strategy to compare convergence
- Animated convergence -- step through iterations and watch data points migrate between clusters
- Elbow method chart -- visualize inertia across K values to find the optimal cluster count
- Silhouette score -- quantitative measure of cluster cohesion and separation
- RFM segment labels -- clusters are mapped to customer archetypes (Champions, Loyal Customers, At Risk, etc.)
How It Works
Everything runs client-side in JavaScript. The K-means++ algorithm is implemented from scratch:
- Smart centroid initialization (K-means++) selects each new centroid with probability proportional to its squared distance from the nearest centroid already chosen
- Assignment step assigns each point to the nearest centroid
- Update step moves centroids to the mean of their assigned points
- Repeat until convergence (centroids stop moving)
Each iteration is rendered as an animation frame on a Chart.js scatter plot.
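The assignment and update steps above can be sketched as a loop that also records one frame per iteration. This is a minimal illustration, not the project's code: `sqDist`, `kMeans`, the `maxIterations` cap, the movement tolerance, and the shape of the returned `history` are all assumptions. Initial centroids are passed in so that any initialization strategy can be used.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function sqDist(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Runs assignment/update rounds until no centroid moves (or maxIterations is hit).
function kMeans(points, initialCentroids, maxIterations = 100) {
  let centroids = initialCentroids.map((c) => c.slice());
  let labels = [];
  const history = []; // one entry per iteration, usable as animation frames

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: each point goes to its nearest centroid.
    labels = points.map((p) => {
      let best = 0;
      let bestDist = Infinity;
      centroids.forEach((c, j) => {
        const d = sqDist(p, c);
        if (d < bestDist) {
          bestDist = d;
          best = j;
        }
      });
      return best;
    });

    // Update step: each centroid moves to the mean of its assigned points.
    const next = centroids.map((c, j) => {
      const members = points.filter((_, i) => labels[i] === j);
      if (members.length === 0) return c.slice(); // empty cluster: leave centroid in place
      return c.map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length);
    });

    // Convergence check: did any centroid move by more than a tiny tolerance?
    const moved = next.some((c, j) => sqDist(c, centroids[j]) > 1e-12);
    centroids = next;
    history.push({ centroids: centroids.map((c) => c.slice()), labels: labels.slice() });
    if (!moved) break;
  }

  return { centroids, labels, iterations: history.length, history };
}
```

Keeping a copy of the centroids and labels for every iteration is what would let a UI step forward and backward through the run instead of only showing the final result.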
Tech Stack
- Chart.js -- scatter plots, line charts, and elbow method visualization
- Vanilla JavaScript -- full K-means++ implementation, animation system, state management
- Flask -- serves the page (no backend computation)