Pearson Correlation

What is Pearson correlation?

Pearson correlation analysis examines the relationship between two variables. For example, is there a correlation between a person's age and salary?

More specifically, we can use the Pearson correlation coefficient to measure the linear relationship between two variables.

Strength and direction of correlation

With a correlation analysis we can determine:

  • how strong the correlation is,
  • and in which direction it goes.

We can read both the strength and the direction of the correlation from the Pearson correlation coefficient r, whose value lies between -1 and 1.

Strength of the correlation

The strength of the correlation can be read from the following table. An absolute value of r between 0 and 0.1 indicates no correlation; an absolute value of r between 0.7 and 1 indicates a very high correlation.

Absolute value of r    Strength of correlation
0.0 ≤ |r| < 0.1        no correlation
0.1 ≤ |r| < 0.3        low correlation
0.3 ≤ |r| < 0.5        medium correlation
0.5 ≤ |r| < 0.7        high correlation
0.7 ≤ |r| ≤ 1          very high correlation
From Kuckartz et al.: Statistik, Eine verständliche Einführung, 2013, p. 213
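
The table can also be applied mechanically. The following is a minimal sketch, assuming the thresholds from the table above; the function name correlation_strength is hypothetical and chosen only for illustration.

```python
def correlation_strength(r: float) -> str:
    """Return the verbal label for a Pearson r, using the thresholds
    from the table above (hypothetical helper, for illustration only)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    a = abs(r)  # only the absolute value determines the strength
    if a < 0.1:
        return "no correlation"
    if a < 0.3:
        return "low correlation"
    if a < 0.5:
        return "medium correlation"
    if a < 0.7:
        return "high correlation"
    return "very high correlation"


print(correlation_strength(-0.62))  # prints "high correlation"
```

The sign of r is ignored here on purpose: it describes the direction of the correlation, not its strength.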

Direction of the correlation

A positive relationship or correlation exists when large values of one variable are associated with large values of the other variable, or when small values of one variable are associated with small values of the other variable.

A positive correlation exists, for example, between height and shoe size. This yields a positive correlation coefficient.

A negative correlation occurs when large values of one variable are associated with small values of the other variable and vice versa.

A negative correlation is usually found between product price and sales volume. This produces a negative correlation coefficient.

Calculate the Pearson correlation coefficient

The Pearson correlation coefficient is calculated using the following equation. Here r is the Pearson correlation coefficient, x_i are the individual values of one variable, e.g., age; y_i are the individual values of the other variable, e.g., salary; and x̄ and ȳ are the mean values of the two variables, respectively.

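$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$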

In the equation, we can see that the respective mean value is first subtracted from both variables.

So in our example, we calculate the mean values of age and salary. We then subtract the mean age from each person's age and the mean salary from each person's salary. For each person, we then multiply the two resulting deviations.

Then we sum up the individual results of the multiplication. The expression in the denominator ensures that the correlation coefficient is scaled between -1 and 1.
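
The steps just described can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name pearson_r is hypothetical, and the age and salary values are made up for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient, computed step by step
    (hypothetical helper, for illustration only)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    # Step 1: mean values of both variables
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Step 2: subtract the respective mean from every value
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    # Step 3: multiply the paired deviations and sum them up (numerator)
    numerator = sum(a * b for a, b in zip(dx, dy))
    # Step 4: the denominator scales the result to the range -1 to 1
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined if one variable is constant")
    return numerator / denominator


# Made-up example data, not taken from a real survey
age = [25, 32, 41, 48, 55]
salary = [2400, 3100, 3500, 4200, 4600]
print(round(pearson_r(age, salary), 3))
```

For these made-up values the result is close to 1, because salary rises almost linearly with age.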

If we now multiply two positive values we get a positive value. If we multiply two negative values we also get a positive value (minus times minus is plus). So all values that lie in these ranges have a positive influence on the correlation coefficient.

Positive correlation Pearson correlation

If we multiply a positive value and a negative value we get a negative value (minus times plus is minus). So all values that are in these ranges have a negative influence on the correlation coefficient.

negative correlation Pearson correlation

Therefore, if our values are predominantly in the two green areas (quadrants) in the previous two figures, we get a positive correlation coefficient and therefore a positive correlation.

If our values are predominantly in the two red areas (quadrants) in the figures, we get a negative correlation coefficient and thus a negative correlation.

If the points are distributed over all four areas (quadrants), the positive terms and the negative terms cancel each other out and we might end up with a very small or no correlation.
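
To make the quadrant argument concrete, the following sketch counts how many points contribute a positive product and how many contribute a negative product. The function name quadrant_counts and all data values are assumptions made for illustration.

```python
def quadrant_counts(x, y):
    """Count points whose product of deviations is positive or negative
    (hypothetical helper, for illustration only)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    positive = negative = 0
    for xi, yi in zip(x, y):
        product = (xi - mean_x) * (yi - mean_y)
        if product > 0:      # both deviations have the same sign
            positive += 1
        elif product < 0:    # the deviations have opposite signs
            negative += 1
        # a product of exactly zero contributes nothing and is not counted
    return positive, negative


# Made-up data: height in cm and shoe size
height = [160, 165, 170, 175, 180, 185]
shoe_size = [37, 38, 40, 41, 43, 44]
print(quadrant_counts(height, shoe_size))  # prints (6, 0)

# Made-up data: product price and sales volume
price = [1, 2, 3, 4]
sales = [50, 42, 30, 21]
print(quadrant_counts(price, sales))  # prints (0, 4)
```

The counts only show the sign pattern. The size of each product also matters for r, so a few points far from the means can outweigh many points close to them.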

Testing correlation coefficients for significance

In general, the correlation coefficient is calculated using data from a sample. In most cases, however, we want to test a hypothesis about the population.

In the case of correlation analysis, we then want to know if there is a correlation in the population.

For this, we test whether the correlation coefficient in the sample is statistically significantly different from zero.

Hypotheses for the Pearson correlation coefficient

The null hypothesis and the alternative hypothesis for the Pearson correlation coefficient are thus:

  • Null hypothesis: The correlation coefficient in the population is zero (there is no linear relationship).
  • Alternative hypothesis: The correlation coefficient in the population is different from zero (there is a linear relationship).

Note: It is always tested whether the null hypothesis is rejected or not rejected.

In our example with the salary and the age of a person, we could thus ask: Is there a correlation between age and salary in the German population?

To find out, we draw a sample and test whether the correlation coefficient is significantly different from zero in this sample.

  • The null hypothesis is then: There is no correlation between salary and age in the German population.
  • and the alternative hypothesis: There is a correlation between salary and age in the German population.

Significance and the t-test

Whether the Pearson correlation coefficient is significantly different from zero based on the sample surveyed can be checked using a t-test. Here, r is the correlation coefficient and n is the sample size.

$$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

A p-value can then be calculated from the test statistic t. If the p-value is smaller than the specified significance level, which is usually 5%, the null hypothesis is rejected; otherwise it is not.
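
As a rough sketch of this test, the code below computes t = r·√(n − 2) / √(1 − r²) and a two-sided p-value from the t-distribution with n − 2 degrees of freedom. It uses only the standard library and integrates the density numerically with Simpson's rule; the function names and the values r = 0.5 and n = 27 are assumptions made for illustration. In practice a statistics library (for example scipy.stats.pearsonr) would normally be used instead.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: no linear correlation in the population."""
    if n < 3:
        raise ValueError("at least 3 observations are needed")
    if abs(r) >= 1:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_p_value(t, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|].

    A simple numerical approximation; it loses precision for
    extremely small p-values."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    if steps % 2:            # Simpson's rule needs an even number of steps
        steps += 1
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # P(-|t| <= T <= |t|)
    return max(0.0, 1.0 - central_area)


# Assumed example values: r = 0.5 from a sample of n = 27 people
r, n = 0.5, 27
t = t_statistic(r, n)
p = two_sided_p_value(t, df=n - 2)
print(round(t, 3), p < 0.05)
```

With these assumed values the p-value is below 5%, so the null hypothesis would be rejected.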

Assumptions of the Pearson correlation coefficient

But what about the assumptions for the Pearson correlation? Here we have to distinguish whether we just want to calculate the Pearson correlation coefficient, or whether we want to test a hypothesis.

To calculate the Pearson correlation coefficient, only two metric variables must be present. Metric variables are, for example, a person's weight, a person's salary, or electricity consumption.

The Pearson correlation coefficient then tells us how strong the linear relationship is. If the relationship is non-linear, we cannot read it from the Pearson correlation coefficient.
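
A small sketch can illustrate this limitation. In the made-up data below, y is fully determined by x (y = x²), yet the Pearson correlation coefficient is zero because the relationship is not linear. The helper pearson_r is hypothetical and defined here only so that the example is self-contained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (hypothetical helper)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Made-up data: a perfect, but non-linear (quadratic) relationship
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r_nonlinear = pearson_r(x, y)
print(abs(r_nonlinear) < 1e-12)  # prints True: no linear correlation
```

The positive and negative products cancel exactly here, which is why a scatter plot is a useful check before interpreting r.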

However, if we want to test whether the Pearson correlation coefficient is significantly different from zero in the sample, i.e. we want to test a hypothesis, the two variables must also be normally distributed!

If this assumption is violated, the calculated test statistic t or the p-value cannot be interpreted reliably. If the assumptions are not met, Spearman's rank correlation can be used.
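
For orientation, Spearman's rank correlation can be sketched as the Pearson correlation of the ranks of the data, with tied values receiving the average of their ranks. The function names and data below are assumptions made for illustration, not the calculator's implementation.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the whole group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient (hypothetical helper)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: monotonic but non-linear relationship
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y))  # prints 1.0
```

Because only the order of the values is used, a strictly increasing relationship yields a coefficient of 1 even when it is not linear.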

Calculate the Pearson correlation online with numiqo

You can also calculate a correlation analysis online with numiqo. To do this, simply copy your data into the table in the statistics calculator and click on either the Hypothesis tests or the Correlation tab.

If you now look at two metric variables, the Pearson correlation coefficient will be calculated automatically. If you don't know exactly how to interpret the results, you can also just click on Summary in words.


Cite numiqo: numiqo Team (2026). numiqo: Online Statistics Calculator. numiqo e.U. Graz, Austria. URL https://numiqo.com
