Cluster Map

  • This module teaches how to create clustermaps in Seaborn for hierarchical clustering. You will learn to interpret dendrograms, group data effectively, and apply customization options to improve data visualization in Python.
  • What is a Cluster Map?

    Theory

    A Cluster Map is:

    A Heatmap + Hierarchical Clustering
    It automatically reorders rows and columns so that similar ones sit next to each other

    It helps in:

    • Finding hidden patterns

    • Grouping similar features

    • Identifying clusters

    Unlike a plain heatmap, a clustermap:

    ✔ Rearranges rows and columns
    ✔ Adds a dendrogram (tree diagram) for each clustered axis


    Hierarchical Clustering

    What is Hierarchical Clustering?

    It is a clustering technique where:

    • Data points are grouped step-by-step

    • Similar items merge together

    • Forms a tree-like structure

    There are two types:

    1. Agglomerative (Bottom → Up) (Used in clustermap)

    2. Divisive (Top → Down)
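    The bottom-up merging described above can be sketched with SciPy, the library Seaborn relies on for clustering. This is a minimal illustration, not part of the original example: the five points are made up, and method="average" with metric="euclidean" is chosen to match clustermap's default settings.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Five hypothetical 1-D points: two tight pairs and one outlier
points = np.array([[1.0], [1.2], [5.0], [5.1], [10.0]])

# Agglomerative clustering: every point starts as its own cluster,
# then the two closest clusters are merged at each step
merges = linkage(points, method="average", metric="euclidean")

# Each row describes one merge: the two cluster ids that were joined,
# the distance at which they joined, and the size of the new cluster
for left, right, distance, size in merges:
    print(f"merge {int(left)} + {int(right)} "
          f"at distance {distance:.2f} (size {int(size)})")
```

    With n points there are always n - 1 merges, and ids of n or higher refer to clusters created by earlier merges.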

    Example — Cluster Map with Correlation

Cluster Map of Correlation Matrix

This visualization uses Seaborn’s clustermap to display a correlation matrix with hierarchical clustering.

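import seaborn as sns
import matplotlib.pyplot as plt

# Load Seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# Pairwise correlations between the numeric columns only
correlation = tips.corr(numeric_only=True)

# Draw the clustered heatmap; annot=True writes each value in its cell
sns.clustermap(correlation,
               cmap="coolwarm",
               annot=True)
plt.show()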
  • Output Explanation

    You will see:

    • Heatmap in center

    • Tree diagram (dendrogram) on top & left

    • Similar variables grouped together

    Example Insight:

    • total_bill & tip may cluster together, since a larger bill usually comes with a larger tip

    • size may form a separate group


    Dendrogram Understanding

    What is a Dendrogram?

    A Dendrogram is a tree-like diagram that shows:

    • How data points merge

    • Similarity distance

    Key Points:

    ✔ Short branches → Highly similar
    ✔ Long branches → Less similar
    ✔ Cluster height → Distance between groups


    Reading Dendrogram

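    If two variables join at a small height → Strong similarity
    If they join at a large height → Weak similarity

    Note: by default, clustermap measures similarity as the Euclidean distance between rows (or columns) of the matrix you pass in. For a correlation matrix, two variables join early when their whole correlation profiles look alike, not only when they correlate strongly with each other.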

    This helps in:

    • Feature grouping

    • Pattern recognition
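    Merge heights can also be read as numbers instead of from the picture. The sketch below uses SciPy directly on a made-up dataset (columns "a", "b", and "c" are hypothetical), with the same method and metric that clustermap uses by default.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
base = rng.normal(size=200)

# Hypothetical data: "a" and "b" are nearly identical, "c" is unrelated noise
df = pd.DataFrame({
    "a": base,
    "b": base + rng.normal(scale=0.1, size=200),
    "c": rng.normal(size=200),
})
corr = df.corr()

# Cluster the rows of the correlation matrix
merges = linkage(corr.values, method="average", metric="euclidean")

# no_plot=True computes the tree layout without drawing it
tree = dendrogram(merges, no_plot=True, labels=corr.columns.tolist())

print("leaf order:", tree["ivl"])
print("merge heights:", merges[:, 2].round(3))
```

    The first height (where "a" and "b" join) should be much smaller than the second (where "c" joins them), which is the numeric version of a short branch versus a long branch.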


    Data Grouping

    Cluster Map automatically:

    ✔ Groups similar rows
    ✔ Groups similar columns
    ✔ Reorders data

    This is useful for:

    • Gene expression analysis

    • Customer segmentation

    • Feature clustering
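    The new row and column order can be read back from the object that clustermap returns, which is handy when the grouping is needed outside the plot. This is a sketch on made-up data (the column names "x1", "x2", and "noise" are hypothetical); it relies on the reordered_ind attribute of the grid's row and column dendrograms.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
base = rng.normal(size=100)

# Hypothetical data: "x1" and "x2" are related, "noise" is not
df = pd.DataFrame({
    "x1": base,
    "noise": rng.normal(size=100),
    "x2": base + rng.normal(scale=0.1, size=100),
})
corr = df.corr()

# clustermap returns a ClusterGrid that keeps the clustering results
g = sns.clustermap(corr, cmap="coolwarm")

# Positions of the original rows/columns in their clustered order
row_order = list(g.dendrogram_row.reordered_ind)
col_order = list(g.dendrogram_col.reordered_ind)

# Apply the same ordering to the original matrix
reordered = corr.iloc[row_order, col_order]
print(reordered.round(2))
```

    In the reordered matrix, "x1" and "x2" end up next to each other even though "noise" sat between them in the input.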

    Example — Standardized Data

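    When rows or columns are on very different scales, rescale them before clustering. standard_scale rescales each row or column to the range 0 to 1 (subtract the minimum, then divide by the range). It matters most for raw data; a correlation matrix is already bounded between -1 and 1.

sns.clustermap(correlation,
               cmap="viridis",
               standard_scale=1)
plt.show()
  • standard_scale=1 → Scale columns
    standard_scale=0 → Scale rows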
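    To see what column scaling does to the numbers, the arithmetic can be reproduced by hand in pandas. This is a sketch on a made-up table (the columns "height_cm" and "income" are hypothetical). It also shows the z-score alternative, which clustermap exposes through its separate z_score parameter.

```python
import pandas as pd

# Hypothetical data with columns on very different scales
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "income": [20000.0, 35000.0, 50000.0, 80000.0],
})

# Min-max scaling per column (the idea behind standard_scale=1):
# every column ends up between 0 and 1
min_max = (df - df.min()) / (df.max() - df.min())

# Z-score per column (the idea behind z_score=1):
# every column ends up with mean 0 and standard deviation 1
z = (df - df.mean()) / df.std()

print(min_max)
print(z.round(2))
```

    Without scaling, the Euclidean distances used for clustering would be dominated by "income" simply because its values are larger.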


    Customization

    Change Color Map

sns.clustermap(correlation,
              cmap="YlGnBu")
  • Remove Row Clustering

sns.clustermap(correlation,
              row_cluster=False)
  • Remove Column Clustering

sns.clustermap(correlation,
              col_cluster=False)
  • Change Figure Size

sns.clustermap(correlation,
              figsize=(8,8))
  • Add Line Width

sns.clustermap(correlation,
              linewidths=1,
              cmap="coolwarm")
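    These options can be combined in a single call. The sketch below uses a made-up correlation matrix (columns "a" to "d" are hypothetical) so that it runs without downloading a dataset.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)

# Hypothetical data: four random numeric columns
df = pd.DataFrame(rng.normal(size=(50, 4)), columns=["a", "b", "c", "d"])
corr = df.corr()

g = sns.clustermap(
    corr,
    cmap="YlGnBu",      # color map
    row_cluster=False,  # keep rows in their original order
    figsize=(8, 8),     # figure size in inches
    linewidths=1,       # thin lines between cells
)
plt.show()
```

    With row_cluster=False only the columns are reordered and only the top dendrogram is drawn.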