Exploratory Factor Analysis
Author: Dr. Hannah Volk-Jesussek
Factor analysis is a method that aims to uncover structure in large variable sets. If you have a data set with many variables, some of them may be interrelated, i.e., correlate with each other. These correlations are the basis of factor analysis.
The aim of factor analysis is to group variables so that variables within a group correlate strongly with each other and only weakly with variables in other groups.
What is a factor?
In factor analysis, a factor can be seen as a hidden variable that influences several observed variables.
In other words, several variables are observable manifestations of fewer underlying factors.
In factor analysis, variables that are highly correlated with each other are combined. It is assumed that this correlation is due to a non-measurable variable, which is called a factor.
Example: Factor Analysis
Factor analysis can be used to answer the following questions:
- What structure can be detected in the data?
- How can the data be reduced to a few factors?
The following table shows where factor analysis is used in different fields.
Examples
| Field | Question | Variables | Possible factors |
|---|---|---|---|
| Psychology | Can different personality traits be grouped into personality types? | Being sociable, spontaneous, curious, nervous, aggressive, etc. | Neuroticism, extraversion, openness to new experiences, conscientiousness, agreeableness |
| Business administration | How can different cost types be summarized in cost characteristics? | Material costs, personnel costs, equipment costs, fixed costs etc. | Influenceability, urgency of coverage |
Research questions for Factor Analysis
A possible research question might be: Can different personality traits such as outgoing, curious, sociable, or helpful be grouped into personality types such as conscientious, extraverted, or agreeable?
You want to find out whether some of the characteristics outgoing, sociable, hard-working, dutiful, warm-hearted, and helpful correlate with each other and can be described by an underlying factor. To find out, you create a small survey with numiqo.
You interview 20 people and export the results to an Excel table. The example data set can be used to calculate the example directly online with numiqo under Factor Analysis Calculator.
Factor loadings, eigenvalues, communalities
The key terms in factor analysis are factor loadings, eigenvalues, and communalities. With their help, it is possible to see how strong the relationship between individual variables and factors is.
Factor loading
- Correlation between a variable and a factor
- Loading of a variable on a factor
Eigenvalue
- The variance explained by a factor across all variables
- Sum of the squared factor loadings of all variables on that factor
Communalities
- The proportion of a variable's variance that is explained by all extracted factors
- Sum of the squared factor loadings of a variable across all factors
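The relationship between these three terms can be sketched with a small loading matrix. The numbers below are invented for illustration and do not come from the example data set. Summing squared loadings down a column gives the variance explained by that factor (this equals the eigenvalue for unrotated loadings), and summing across a row gives the communality of that variable.

```python
import numpy as np

# Hypothetical loading matrix (values invented for illustration):
# rows are variables, columns are factors.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

squared = loadings ** 2

# Variance explained per factor: sum of squared loadings down each column.
eigenvalues = squared.sum(axis=0)

# Communality per variable: sum of squared loadings across each row.
communalities = squared.sum(axis=1)

print("Variance explained per factor:", eigenvalues)
print("Communality per variable:", communalities)
```

Both sums add up to the same total, because they are two ways of summing the same squared loadings.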
Correlation matrix
The first step in factor analysis is to calculate the correlation matrix of the variables. Starting from the correlation matrix, the so-called eigenvalue problem is solved; the resulting eigenvalues and eigenvectors are used to calculate the factors.
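As a rough sketch of this step, the following code computes a correlation matrix and solves the eigenvalue problem with NumPy. It assumes principal component extraction, which is only one of several extraction methods, and it uses synthetic data (20 respondents, 6 items built from 3 made-up latent factors) rather than the example data set.

```python
import numpy as np

# Synthetic data, for illustration only: 20 respondents, 6 items,
# where each pair of items is driven by one of 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 3))
noise = 0.5 * rng.normal(size=(20, 6))
data = np.repeat(latent, 2, axis=1) + noise

# Step 1: correlation matrix of the observed variables (columns).
R = np.corrcoef(data, rowvar=False)

# Step 2: solve the eigenvalue problem for the symmetric matrix R.
eigvals, eigvecs = np.linalg.eigh(R)

# eigh returns eigenvalues in ascending order; sort them descending.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Unrotated loadings: eigenvectors scaled by the square root of
# their eigenvalues (clipped at 0 to guard against rounding noise).
loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

print(np.round(eigvals, 3))
```

The eigenvalues of a correlation matrix sum to the number of variables, here 6, which is why an eigenvalue can be read as "how many variables' worth of variance" a factor explains.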
Factor analysis and dimensionality
It is important to note that factor analysis does not give a definitive answer as to how many factors should be retained or how these factors should be interpreted.
There are two common methods to determine the number of required factors: the eigenvalue criterion (Kaiser criterion) and the scree test.
Eigenvalue criterion (Kaiser criterion)
To determine the number of factors with the eigenvalue criterion (Kaiser criterion), the eigenvalues of the individual factors are used. All factors with eigenvalues greater than 1 are retained.
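A minimal sketch of the Kaiser criterion, assuming the eigenvalues are taken from the correlation matrix. The function name `kaiser_n_factors` and the correlation matrix are made up for this illustration.

```python
import numpy as np

def kaiser_n_factors(corr):
    """Count the eigenvalues of a correlation matrix that exceed 1."""
    eigvals = np.linalg.eigvalsh(corr)
    return int((eigvals > 1.0).sum())

# Hypothetical correlation matrix: variables 1 and 2 correlate,
# variables 3 and 4 correlate, and the two pairs are unrelated.
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

n_factors = kaiser_n_factors(R)
print("Factors retained:", n_factors)
```

Here the eigenvalues are 1.8, 1.6, 0.4, and 0.2, so two factors are retained, one for each correlated pair.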
Scree test
To determine the number of factors with the scree test or scree plot, the eigenvalues are sorted by size and displayed in a line chart. The number of factors is read off at the bend (the "elbow") in the chart, i.e., the point after which the curve flattens out.
In the "Explained total variance" table, you can see the variance explained by each factor and the cumulative variance.
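The values in such a table, and the points of a scree plot, can be derived from the eigenvalues alone. The following sketch uses made-up eigenvalues and a hypothetical helper `explained_variance_table`; it is not the output of the example data set.

```python
import numpy as np

def explained_variance_table(eigvals):
    """Return sorted eigenvalues, their share of variance, and the running total."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    share = eigvals / eigvals.sum()
    cumulative = np.cumsum(share)
    return eigvals, share, cumulative

# Made-up eigenvalues of a 4-variable correlation matrix.
eigvals, share, cumulative = explained_variance_table([0.2, 1.8, 0.4, 1.6])

for i, (ev, s, c) in enumerate(zip(eigvals, share, cumulative), start=1):
    print(f"Factor {i}: eigenvalue={ev:.2f}, variance={s:.1%}, cumulative={c:.1%}")
```

Plotting the sorted eigenvalues against the factor number gives the scree plot; the cumulative column shows how much of the total variance is covered when the first factors are kept.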
Communalities
Once the number of factors is determined, the communalities can be calculated. As written above, the communality indicates the variance of a variable that is explained by all factors. If, for example, three factors were selected, the communalities show the proportion of variance in each variable explained by those three factors.
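How the communalities depend on the number of retained factors can be sketched as follows. The code assumes principal component extraction, and the correlation matrix and the function name `communalities` are invented for illustration.

```python
import numpy as np

def communalities(corr, n_factors):
    """Communalities when the n_factors largest components are retained."""
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Indices of the n_factors largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings: eigenvectors scaled by the square root of the eigenvalues.
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))
    # Communality: sum of squared loadings of each variable.
    return (loadings ** 2).sum(axis=1)

# Hypothetical correlation matrix with two correlated pairs of variables.
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

print("2 factors:", np.round(communalities(R, 2), 3))
print("4 factors:", np.round(communalities(R, 4), 3))
```

With two factors, part of each variable's variance remains unexplained. If as many factors as variables are retained, every communality becomes 1, because all of the variance is reproduced.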
Component matrix
The component matrix contains the loadings of the variables on the factors. Since the first factor explains most of the variance, the loadings on the first component or factor tend to be the largest. However, in this form it is difficult to interpret the factors, so the matrix is rotated.
Rotation Matrix
The component matrix often shows many variables loading highly on the first factor. This makes the matrix difficult to interpret. Therefore, the matrix is rotated. There are different rotation procedures, but the most common is Varimax rotation.
Varimax Rotation
Varimax rotation aims to make variables load as highly as possible on one factor and as low as possible on others. It does this by maximizing the variance of squared loadings within each factor.
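The idea can be sketched with a commonly used iterative SVD scheme for Varimax. This is one possible implementation, not necessarily the one used by numiqo; the function names, tolerance, and loading values are assumptions made for this illustration. The example starts from a clean two-factor structure, mixes the factors by 30 degrees to imitate a hard-to-read unrotated matrix, and then lets Varimax recover a simple structure.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a loading matrix orthogonally with an iterative SVD scheme."""
    L = np.asarray(loadings, dtype=float)
    n_vars, n_factors = L.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient of the Varimax criterion with respect to the rotation.
        col_ss = (rotated ** 2).sum(axis=0)
        gradient = L.T @ (rotated ** 3 - rotated * col_ss / n_vars)
        # The closest orthogonal matrix to the gradient is the next rotation.
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves noticeably.
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ rotation, rotation

def varimax_criterion(loadings):
    """Variance of the squared loadings within each factor, summed over factors."""
    return (np.asarray(loadings) ** 2).var(axis=0).sum()

# Hypothetical simple structure: each variable loads on exactly one factor.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]])

# Mix the two factors by 30 degrees to imitate an unrotated matrix.
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), np.sin(angle)],
                [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, the communalities of the variables do not change; only the distribution of the loadings across the factors does. The sign and order of the rotated factors are arbitrary.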
In the rotated component matrix of the example, you can see that "outgoing" and "sociable" load on Extraversion, "hard-working" and "dutiful" load on Conscientiousness, and "warm-hearted" and "helpful" load on Agreeableness.