Cluster Map
This module teaches how to create cluster maps in Seaborn using hierarchical clustering. You will learn to interpret dendrograms, group data effectively, and apply customization options to improve data visualization in Python.
What is a Cluster Map?
Theory
A Cluster Map is:
A Heatmap + Hierarchical Clustering
Automatically groups similar rows and columns
It helps in:
Finding hidden patterns
Grouping similar features
Identifying clusters
Unlike a plain heatmap, a clustermap:
✔ Rearranges rows & columns
✔ Adds dendrograms (tree diagrams)
Hierarchical Clustering
What is Hierarchical Clustering?
It is a clustering technique where:
Data points are grouped step-by-step
Similar items merge together
The merges form a tree-like structure
There are two types:
Agglomerative (Bottom → Up) (Used in clustermap)
Divisive (Top → Down)
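The bottom-up merging can be sketched with SciPy's linkage function, which seaborn relies on for clustering. The five one-dimensional points below are made up for illustration; the method and metric shown are the ones clustermap uses by default.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical 1-D observations: two tight pairs and one outlier.
points = np.array([[1.0], [1.2], [5.0], [5.3], [9.0]])

# Agglomerative clustering: every point starts as its own cluster,
# and the two closest clusters are merged at each step.
merges = linkage(points, method="average", metric="euclidean")

# Each row describes one merge: the two cluster ids that were joined,
# the distance at which they joined, and the size of the new cluster.
for step, (left, right, distance, size) in enumerate(merges, start=1):
    print(f"step {step}: merge {int(left)} and {int(right)} "
          f"at distance {distance:.2f} (size {int(size)})")
```

Ids 0 to 4 are the original points; ids from 5 upward refer to clusters created by earlier merges.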
Example — Cluster Map with Correlation
Cluster Map of Correlation Matrix
This visualization uses Seaborn’s clustermap to display a correlation matrix with hierarchical clustering.
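import seaborn as sns
import matplotlib.pyplot as plt

# Load seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# Correlation matrix of the numeric columns only
correlation = tips.corr(numeric_only=True)

# Heatmap with rows and columns reordered by hierarchical clustering
sns.clustermap(correlation,
               cmap="coolwarm",
               annot=True)
plt.show()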
Output Explanation
You will see:
Heatmap in center
Tree diagram (dendrogram) on top & left
Similar variables grouped together
Example Insight:
total_bill & tip may cluster together
size may form a separate group
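To check the grouping numerically rather than by eye, the grid object returned by clustermap can be inspected. The sketch below uses a small synthetic DataFrame so that it runs without downloading anything; the column names bill, tip and noise are hypothetical stand-ins, not the tips dataset.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (an assumption for illustration):
# "bill" and "tip" move together, "noise" is unrelated to both.
rng = np.random.default_rng(0)
bill = rng.normal(20, 5, size=200)
data = pd.DataFrame({
    "bill": bill,
    "noise": rng.normal(0, 1, size=200),
    "tip": 0.15 * bill + rng.normal(0, 0.3, size=200),
})
corr = data.corr()

grid = sns.clustermap(corr, cmap="coolwarm", annot=True)

# Leaf order chosen by the row dendrogram (positions in the original index).
row_order = grid.dendrogram_row.reordered_ind
print("original order: ", list(corr.index))
print("clustered order:", [corr.index[i] for i in row_order])

# The reordered matrix that is actually drawn in the heatmap.
print(grid.data2d.round(2))
```

Variables that were merged early in the tree end up next to each other in the clustered order. Call plt.show() from matplotlib.pyplot to display the figure.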
Dendrogram Understanding
What is a Dendrogram?
A Dendrogram is a tree-like diagram that shows:
How data points merge
Similarity distance
Key Points:
✔ Short branches → Highly similar
✔ Long branches → Less similar
✔ Merge height → Distance between the groups being joined
Reading a Dendrogram
If two variables join at a small height → Strong similarity
If they join at a large height → Weak similarity
This helps in:
Feature grouping
Pattern recognition
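The merge heights can also be read as numbers instead of from the picture. This is a minimal sketch on synthetic features (a, b and c are hypothetical), using SciPy directly with the same average linkage and Euclidean distance that clustermap uses by default.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(1)
x = rng.normal(size=300)
features = pd.DataFrame({
    "a": x,
    "b": x + rng.normal(scale=0.1, size=300),  # nearly a copy of "a"
    "c": rng.normal(size=300),                 # unrelated to both
})
corr = features.corr()

# Cluster the rows of the correlation matrix.
merges = linkage(corr.to_numpy(), method="average", metric="euclidean")

# Column 2 of the linkage matrix holds the height of each merge.
first_height, last_height = merges[0, 2], merges[-1, 2]
print(f"first merge height: {first_height:.3f}")  # low join: strong similarity
print(f"last merge height:  {last_height:.3f}")   # high join: weak similarity

# Leaf order of the tree, computed without drawing anything.
tree = dendrogram(merges, labels=list(corr.columns), no_plot=True)
print(tree["ivl"])
```

Here a and b join first at a small height, while c only joins at the final, much higher merge.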
Data Grouping
Cluster Map automatically:
✔ Groups similar rows
✔ Groups similar columns
✔ Reorders data
This is useful for:
Gene expression analysis
Customer segmentation
Feature clustering
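For a task such as customer segmentation, the tree usually has to be turned into explicit group labels. One possible way, sketched here with SciPy's fcluster on a made-up customer table (all names and values are hypothetical), is to cut the tree into a fixed number of groups.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical customers: three low spenders and three high spenders.
customers = pd.DataFrame(
    {"visits": [2, 3, 2, 20, 22, 19], "spend": [15, 18, 14, 210, 190, 205]},
    index=["c1", "c2", "c3", "c4", "c5", "c6"],
)

# Build the row tree with clustermap's default method and metric.
row_linkage = linkage(customers.to_numpy(dtype=float),
                      method="average", metric="euclidean")

# Cut the tree into two flat groups and attach the labels to the rows.
groups = fcluster(row_linkage, t=2, criterion="maxclust")
customers["group"] = groups
print(customers.sort_values("group"))
```

The same precomputed linkage can be handed to clustermap through its row_linkage parameter, so the labels and the plotted dendrogram come from one tree.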
Example — Standardized Data
When rows or columns are on very different scales, rescale them before clustering so that no single one dominates the distances:
sns.clustermap(correlation,
               cmap="viridis",
               standard_scale=1)
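As a rough sketch of what this rescaling does to the numbers, the same operation can be written by hand in pandas. It follows seaborn's documented description of standard_scale (subtract the minimum, then divide by the maximum); the table below is hypothetical.

```python
import pandas as pd

# Hypothetical table whose columns live on very different scales.
df = pd.DataFrame({"age": [20, 30, 40], "income": [20000, 50000, 80000]})

# Column-wise rescaling, roughly what standard_scale=1 applies before
# clustering: subtract each column's minimum, then divide by its range.
scaled = (df - df.min()) / (df.max() - df.min())
print(scaled)
```

After rescaling, every column runs from 0 to 1, so age and income contribute comparably to the distances.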
standard_scale=1 → Rescale each column to the 0 to 1 range
standard_scale=0 → Rescale each row to the 0 to 1 range
Customization
Change Color Map
sns.clustermap(correlation,
               cmap="YlGnBu")
Remove Row Clustering
sns.clustermap(correlation,
               row_cluster=False)
Remove Column Clustering
sns.clustermap(correlation,
               col_cluster=False)
Change Figure Size
sns.clustermap(correlation,
               figsize=(8, 8))
Add Line Width
sns.clustermap(correlation,
               linewidths=1,
               cmap="coolwarm")
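These options can be combined in a single call. The sketch below uses a synthetic correlation matrix (columns w, x, y and z are hypothetical) and then checks the effect of row_cluster=False on the returned grid.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, used only so the example runs on its own.
rng = np.random.default_rng(2)
data = pd.DataFrame(rng.normal(size=(100, 4)), columns=["w", "x", "y", "z"])
corr = data.corr()

grid = sns.clustermap(
    corr,
    cmap="YlGnBu",      # color map
    figsize=(8, 8),     # figure size in inches
    linewidths=1,       # gap between cells
    row_cluster=False,  # keep rows in their original order
)

# With row clustering off, no row dendrogram is computed
# and the rows of the drawn matrix keep their original order.
print(grid.dendrogram_row is None)
print(list(grid.data2d.index) == list(corr.index))
```

Columns are still clustered here, so the column order in the heatmap may differ from the order in corr.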