
Hierarchical Cluster Analysis

Author: Dr. Hannah Volk-Jesussek

Hierarchical cluster analysis is a clustering method that arranges the objects to be clustered in a hierarchical tree, called a dendrogram.

Hierarchical cluster analysis dendrogram

The tree represents the relationships between objects and shows how objects are clustered at different levels.

Example Hierarchical Cluster Analysis

In our example, we asked people how many hours per week they spend on social media platforms and at the gym.

Hierarchical cluster analysis example data

We now want to know whether there are clusters in this dataset, so we perform a hierarchical cluster analysis.

How is a Hierarchical Cluster Analysis calculated?

First, we plot the points in a scatter plot.

Scatter plot Hierarchical Cluster Analysis

With this, we can start to create the clusters. In the first step, we assign each point to its own cluster, so we have as many clusters as we have people.

each point a cluster

The goal is to merge clusters step by step until all points are in one cluster.

Calculate clusters Hierarchical cluster analysis

In each step, the two clusters that are closest together are merged. But what does "closest together" mean?

To answer this, we need to determine two things:

  • How the distance between two points is measured.
  • How the distance between two clusters is derived from the points they contain.

Distance between two points

Let's start with the question of how to calculate the distance between two points. Here are the most common distance measures:

  • Euclidean Distance
  • Manhattan Distance
  • Maximum Distance

Let's take the distance between Max and Caro. The difference on the y-axis is 1 and the difference on the x-axis is 4.

Euclidean Distance

The Euclidean distance is the square root of the sum of the squared differences. For Max and Caro, this is the square root of 4² + 1² = 17, which is about 4.12.

Euclidean distance

Manhattan Distance

The Manhattan distance uses the sum of the absolute differences. So we simply calculate 4 plus 1, which gives a distance of 5.

Manhattan Distance

Maximum Distance

The maximum distance is simply the maximum value of the absolute differences. In this case it is 4.

Maximum Distance
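The three distance measures can be sketched in a few lines of Python. The coordinates below are hypothetical: they are chosen only so that the two points differ by 4 on the x-axis and by 1 on the y-axis, as in the Max and Caro example, and they are not taken from the original data table. The function names are also just illustrative.

```python
import math

def euclidean(p, q):
    # Square root of the sum of the squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # Sum of the absolute differences.
    return sum(abs(a - b) for a, b in zip(p, q))

def maximum(p, q):
    # Largest absolute difference (also known as Chebyshev distance).
    return max(abs(a - b) for a, b in zip(p, q))

# Hypothetical coordinates: difference of 4 on x and 1 on y.
max_point = (10, 5)
caro_point = (6, 6)

print(round(euclidean(max_point, caro_point), 2))  # 4.12
print(manhattan(max_point, caro_point))            # 5
print(maximum(max_point, caro_point))              # 4
```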

Linkage Methods

Now that we know how to calculate distances between points, we need to determine how the distance between two clusters is calculated when each cluster can contain several points. This is what the linkage method defines.

Linking method Hierarchical cluster analysis

Let's say we have a cluster with the points Joe and Lisa and a cluster with Max and Caro. Now how do we determine the distance between these two clusters? Here are the most popular methods:

  • Single linkage
  • Complete linkage
  • Average linkage

Single-linkage

Single linkage uses the distance between the closest elements in the clusters. This is the distance between Caro and Joe.

Single-linkage

Complete-linkage

Complete linkage uses the distance between the farthest elements in the clusters. Here, this is the distance between Max and Joe.

Complete-linkage

Average-linkage

Average linkage uses the average of all pairwise distances. The distance is calculated for each pair and then averaged.

Average-linkage
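As a minimal sketch, the three linkage methods can be written as functions that take two clusters (lists of points) and return one distance. The two clusters below are made-up points, not the people from the example, and the Euclidean distance is assumed as the point-to-point measure.

```python
import math
from itertools import product

def single_linkage(cluster_a, cluster_b):
    # Distance between the closest pair of points.
    return min(math.dist(p, q) for p, q in product(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    # Distance between the farthest pair of points.
    return max(math.dist(p, q) for p, q in product(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    # Mean of all pairwise distances between the two clusters.
    distances = [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]
    return sum(distances) / len(distances)

# Hypothetical clusters, chosen to give round numbers.
cluster_a = [(0, 0), (0, 3)]
cluster_b = [(4, 0), (4, 3)]

print(single_linkage(cluster_a, cluster_b))    # 4.0
print(complete_linkage(cluster_a, cluster_b))  # 5.0
print(average_linkage(cluster_a, cluster_b))   # 4.5
```

The same two clusters get three different distances, which is why the choice of linkage method can change the order in which clusters are merged.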

Example: Hierarchical Cluster Analysis

For our example, we use the Euclidean distance and the single linkage method. So now we need the distance from each cluster to all other clusters.

Distances between clusters

For this we first need to calculate the distance matrix. In the distance matrix we enter the clusters on both dimensions and then calculate the distances from each cluster to every other cluster.

Distance matrix

The distance between Alan and Lisa is given by:

Calculate distance matrix
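The distance matrix can be sketched as follows. The coordinates are hypothetical placeholders for (social media hours, gym hours); the original data table is not reproduced here, so the printed numbers will not match the figures exactly. The values were picked so that the first merge is the same as in the example.

```python
import math

# Hypothetical (social media hours, gym hours); NOT the original data.
people = {
    "Alan": (11, 3),
    "Lisa": (2, 8),
    "Joe": (3, 7),
    "Max": (10, 5),
    "Caro": (6, 6),
}
names = list(people)

# Enter the clusters on both dimensions and fill in every pairwise distance.
matrix = {a: {b: math.dist(people[a], people[b]) for b in names} for a in names}

# Print the matrix with two decimals.
print(" " * 6 + "".join(f"{n:>7}" for n in names))
for a in names:
    print(f"{a:<6}" + "".join(f"{matrix[a][b]:7.2f}" for b in names))

# The smallest off-diagonal entry identifies the first pair to merge.
pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
closest = min(pairs, key=lambda pair: matrix[pair[0]][pair[1]])
print(closest)
```

With these made-up values, the smallest entry is the one between Lisa and Joe.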

We can now do this for all other combinations until we have calculated the full distance matrix. Now we can merge the first clusters by finding the smallest distance. This is the case between Joe and Lisa.

Example Hierarchical Cluster Analysis

With this, we now combine Joe and Lisa into one cluster. In our tree diagram or dendrogram we can draw the first connection.

First connection in the tree diagram

Now we need to update our distance matrix. We decided to use the single linkage method, so the distance between two clusters is given by the two elements that are closest to each other. For each of the clusters Alan, Max, and Caro, the closest point in the Lisa and Joe cluster is Joe.

Merge hierarchical clusters

So we calculate the distance from Alan to Joe, the distance from Max to Joe, and the distance from Caro to Joe.
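One way to sketch this update step: under single linkage, the distance from the merged cluster to any other cluster is the smaller of the two old distances. The helper `merge_single_linkage`, the "Lisa+Joe" label format, and the coordinates are all assumptions made for illustration, not part of the original example.

```python
import math
from itertools import combinations

# Hypothetical (social media hours, gym hours); NOT the original data.
points = {"Alan": (11, 3), "Lisa": (2, 8), "Joe": (3, 7),
          "Max": (10, 5), "Caro": (6, 6)}

# Distance table keyed by unordered pairs of cluster labels.
dist = {frozenset((a, b)): math.dist(points[a], points[b])
        for a, b in combinations(points, 2)}

def merge_single_linkage(dist, a, b):
    """Return a new distance table in which clusters a and b are merged."""
    merged = f"{a}+{b}"
    others = {c for pair in dist for c in pair} - {a, b}
    # Keep all distances that do not involve a or b.
    new_dist = {pair: d for pair, d in dist.items()
                if a not in pair and b not in pair}
    # Single linkage: the merged cluster is as close as its closest member.
    for c in others:
        new_dist[frozenset((merged, c))] = min(dist[frozenset((a, c))],
                                               dist[frozenset((b, c))])
    return new_dist

updated = merge_single_linkage(dist, "Lisa", "Joe")
for pair, d in sorted(updated.items(), key=lambda item: item[1]):
    print(" - ".join(sorted(pair)), round(d, 2))
```

With these made-up coordinates, every distance to the merged cluster is the distance to Joe, and the smallest remaining entry is the one between Alan and Max.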

Now we again merge the clusters that are closest. These are Max and Alan.

Hierarchical Clustering Example

In our tree diagram or dendrogram, we can draw the second connection.

Dendrogram Connection

Now we update the distance matrix again. We calculate the distance between Alan and Joe, Caro and Joe, and between Caro and Alan. The smallest distance is between the Caro cluster and the Lisa and Joe cluster.

Hierarchical clustering cluster merge

So we connect these two clusters and draw the third connection in the tree diagram.

Now there are only two clusters left, and we merge them in the last step to obtain the finished dendrogram.

Calculating a large cluster
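The whole procedure can be sketched as a short loop: start with one cluster per person, repeatedly merge the two closest clusters, and record each merge together with its distance (the height of the connection in the dendrogram). This is a naive illustration with Euclidean distance and single linkage, not the implementation used by any particular tool, and the coordinates are again hypothetical.

```python
import math
from itertools import combinations

def single_linkage_clustering(points):
    """Return the list of merges as (cluster_1, cluster_2, distance)."""
    clusters = [(name,) for name in points]  # every point starts alone
    merges = []

    def cluster_distance(c1, c2):
        # Single linkage: distance between the closest members.
        return min(math.dist(points[p], points[q]) for p in c1 for q in c2)

    while len(clusters) > 1:
        # Find the pair of clusters that are closest together.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        # Replace the two clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical (social media hours, gym hours); NOT the original data.
points = {"Alan": (11, 3), "Lisa": (2, 8), "Joe": (3, 7),
          "Max": (10, 5), "Caro": (6, 6)}

merges = single_linkage_clustering(points)
for c1, c2, height in merges:
    print(c1, "+", c2, "at", round(height, 2))
```

With these made-up values, the merge order matches the walkthrough: Lisa and Joe first, then Alan and Max, then Caro joins the Lisa and Joe cluster, and finally the two remaining clusters are merged.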

Calculate hierarchical cluster analysis with numiqo

To calculate a hierarchical cluster analysis online, just visit the statistics calculator and copy your own data into the table or use the link to load the dataset. Now we click on Cluster and select Hierarchical Cluster.

If we now click on Social Media and Gym, a hierarchical cluster analysis will be calculated for us. Additionally, we can specify the label, in our case the names of the people.

Calculate hierarchical cluster analysis with numiqo

Now we can specify which linkage method should be used and how the distance should be calculated. We simply take single linkage and the Euclidean distance again.

Calculate hierarchical cluster analysis online

Now we get the results below: the tree plot, a scatter plot, and the elbow plot. In the elbow plot we can choose how many clusters to take. We can see a kink here, so we take 4 clusters. We can select this above, and in the tree plot the four clusters are highlighted in different colors.
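Choosing a number of clusters corresponds to stopping the merging early, or equivalently cutting the dendrogram at a certain height. The sketch below is an assumption about how such a cut can be computed by hand; it does not describe how numiqo works internally, and `cluster_into_k` and the coordinates are hypothetical.

```python
import math
from itertools import combinations

def cluster_into_k(points, k):
    """Merge the closest clusters (single linkage) until k clusters remain."""
    clusters = [(name,) for name in points]

    def cluster_distance(c1, c2):
        return min(math.dist(points[p], points[q]) for p in c1 for q in c2)

    while len(clusters) > k:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return clusters

# Hypothetical (social media hours, gym hours); NOT the original data.
points = {"Alan": (11, 3), "Lisa": (2, 8), "Joe": (3, 7),
          "Max": (10, 5), "Caro": (6, 6)}

print(cluster_into_k(points, 2))
```

For these five made-up points and k = 2, Alan and Max end up in one cluster and Caro, Lisa, and Joe in the other.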



Cite numiqo: numiqo Team (2026). numiqo: Online Statistics Calculator. numiqo e.U. Graz, Austria. URL https://numiqo.com
