Introduction
Principal component analysis (PCA) projects the data onto the directions of maximum variance. However, in some cases, maximizing the variance of the data does not yield highly discriminative features. Fisher's linear discriminant analysis (LDA) is also a projection-based method that seeks a vector defining the projection direction of the data [1]. Unlike PCA, which maximizes the variance of the data [2], LDA uses the labels of the data and projects the data in the direction that maximizes the separation of the classes, as shown in Figure 1.

How do we find the projection that maximizes the class separation of the data? In LDA, maximum separation is achieved when the projected clusters are as far apart as possible. We consider the classification of two classes first, then generalize to more than two classes.
Binary Class LDA
Let $\tilde{\mu}_1$ and $\tilde{\mu}_2$ denote the means of the data after projection onto the vector $w$, belonging to class RED and class BLUE respectively:

$$\tilde{\mu}_k = \frac{1}{N_k} \sum_{x_i \in C_k} w^T x_i = w^T \mu_k, \qquad k \in \{1, 2\}$$

where $N_k$ is the number of samples in class $C_k$ and $\mu_k$ is the class mean in the original space.
Furthermore, let $\tilde{s}_1^2$ and $\tilde{s}_2^2$ denote the scatters (variances) of the data after projection, belonging to class RED and class BLUE respectively:

$$\tilde{s}_1^2 = \sum_{x_i \in C_1} \left(w^T x_i - \tilde{\mu}_1\right)^2$$

$$\tilde{s}_2^2 = \sum_{x_i \in C_2} \left(w^T x_i - \tilde{\mu}_2\right)^2$$
To ensure maximum class separation, we want the projected means to be far away from each other and the projected data within each cluster to be close together. In other words, we want the mean difference $(\tilde{\mu}_1 - \tilde{\mu}_2)^2$ to be large and the sum of the scatters $\tilde{s}_1^2 + \tilde{s}_2^2$ to be small. Therefore, the objective is to maximize

$$J(w) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$$
The numerator can be expressed as

$$(\tilde{\mu}_1 - \tilde{\mu}_2)^2 = \left(w^T \mu_1 - w^T \mu_2\right)^2 = w^T (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w = w^T S_B w$$

where $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$ is the between-class scatter matrix.
The denominator can be expressed as

$$\tilde{s}_1^2 + \tilde{s}_2^2 = w^T S_W w$$

where $S_W = \sum_{x_i \in C_1} (x_i - \mu_1)(x_i - \mu_1)^T + \sum_{x_i \in C_2} (x_i - \mu_2)(x_i - \mu_2)^T$ is the within-class scatter matrix.
Therefore, the objective function is

$$J(w) = \frac{w^T S_B w}{w^T S_W w}$$
Taking the derivative with respect to $w$ and setting it equal to zero gives

$$(w^T S_W w)\, S_B w - (w^T S_B w)\, S_W w = 0$$

Dividing through by $w^T S_W w$, and given that $J(w) = \frac{w^T S_B w}{w^T S_W w}$ is a scalar, which we denote $\lambda$, the expression becomes

$$S_W^{-1} S_B w = \lambda w$$
where $w$ is an eigenvector of $S_W^{-1} S_B$ and $\lambda$ is the corresponding eigenvalue. Since $\lambda = J(w)$, the optimal projection direction is the eigenvector associated with the largest eigenvalue.
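For the binary case, the eigenproblem admits a closed form: $S_B w = (\mu_1 - \mu_2)\left[(\mu_1 - \mu_2)^T w\right]$ always points along $\mu_1 - \mu_2$, so the top eigenvector satisfies $w \propto S_W^{-1}(\mu_1 - \mu_2)$. Here is a minimal NumPy sketch of that shortcut; the helper name `fisher_direction` is my own, and the means and scatter matrix below are hypothetical values:

```python
import numpy as np

def fisher_direction(mu1, mu2, S_W):
    """Binary LDA projection direction, up to scale: w = S_W^{-1} (mu1 - mu2)."""
    w = np.linalg.solve(S_W, mu1 - mu2)  # solve S_W w = (mu1 - mu2)
    return w / np.linalg.norm(w)         # normalize to unit length

# Hypothetical class means and within-class scatter matrix.
mu1 = np.array([4.0, 2.0])
mu2 = np.array([1.0, 3.0])
S_W = np.array([[2.0, 0.5],
                [0.5, 1.0]])
print(fisher_direction(mu1, mu2, S_W))
```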
Multi-class LDA
When there are more than two classes, the between-class scatter matrix is defined as

$$S_B = \sum_{k=1}^{c} N_k (\mu_k - \mu)(\mu_k - \mu)^T$$

where $\mu_k = \frac{1}{N_k} \sum_{x_i \in C_k} x_i$ is the mean of class $k$ and $\mu$ is the global (overall) mean.
The within-class scatter matrix is defined as

$$S_W = \sum_{k=1}^{c} \sum_{x_i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^T$$

where the inner sum runs over the samples $x_i$ belonging to class $C_k$.
The number of components is defined as follows:

$$\text{number of components} \leq c - 1$$

where $c$ is the number of classes. This bound holds because the deviations $\mu_k - \mu$ are linearly dependent, so $S_B$ has rank at most $c - 1$ and only that many directions carry discriminative information.
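To make the multi-class recipe concrete, here is a minimal NumPy sketch, assuming a data matrix `X` and integer class labels `y`; the helper name `lda_components` is my own (scikit-learn's `LinearDiscriminantAnalysis` offers a production implementation):

```python
import numpy as np

def lda_components(X, y, n_components=None):
    """Multi-class LDA: top eigenvectors of S_W^{-1} S_B."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]
    mu = X.mean(axis=0)                              # global (overall) mean
    S_B = np.zeros((n_features, n_features))
    S_W = np.zeros((n_features, n_features))
    for k in classes:
        X_k = X[y == k]
        mu_k = X_k.mean(axis=0)
        d = (mu_k - mu).reshape(-1, 1)
        S_B += len(X_k) * (d @ d.T)                  # N_k (mu_k - mu)(mu_k - mu)^T
        S_W += (X_k - mu_k).T @ (X_k - mu_k)         # within-class outer products
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]           # sort by decreasing eigenvalue
    if n_components is None:
        n_components = len(classes) - 1              # at most c - 1 useful directions
    return eigvecs[:, order[:n_components]].real
```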
Solving for Eigenvector and Eigenvalue
1. Compute $S_B$ and $S_W$.
2. Compute $S_W^{-1} S_B$.
3. Solve $\det(S_W^{-1} S_B - \lambda I) = 0$, where $I$ is an identity matrix, to obtain the eigenvalues $\lambda$.
4. Substitute each $\lambda$ into $(S_W^{-1} S_B - \lambda I)\, w = 0$ to obtain the corresponding eigenvector $w$.
Numerical Example
Consider a two-dimensional dataset with two classes, RED and BLUE. First, compute the class means $\mu_1$ and $\mu_2$, the between-class scatter matrix $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$, and the within-class scatter matrix $S_W$.

Find $\lambda$ by solving $\det(S_W^{-1} S_B - \lambda I) = 0$, where $I$ is an identity matrix. In two dimensions the characteristic equation is quadratic, yielding two eigenvalues $\lambda_1$ and $\lambda_2$; keep the larger of the two.

Substitute the chosen $\lambda$ into $(S_W^{-1} S_B - \lambda I)\, w = 0$. Since an eigenvector is only determined up to scale, let $w_1 = 1$, then solve for $w_2$ to obtain the projection direction $w$.
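The same steps can be carried out end to end in NumPy. The dataset below is hypothetical, chosen only to illustrate the procedure, and the eigenvector is rescaled so that $w_1 = 1$ as above:

```python
import numpy as np

# Hypothetical two-class, two-dimensional dataset (illustrative values only).
X_red  = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [4.0, 5.0], [5.0, 5.0]])
X_blue = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 1.0], [3.0, 2.0], [5.0, 3.0]])

mu1, mu2 = X_red.mean(axis=0), X_blue.mean(axis=0)

# Between-class and within-class scatter matrices.
d = (mu1 - mu2).reshape(-1, 1)
S_B = d @ d.T
S_W = ((X_red - mu1).T @ (X_red - mu1)
       + (X_blue - mu2).T @ (X_blue - mu2))

# Eigenvalues/eigenvectors of S_W^{-1} S_B, i.e. det(S_W^{-1} S_B - lambda I) = 0.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
idx = np.argmax(eigvals.real)            # keep the larger eigenvalue
w = eigvecs[:, idx].real
w = w / w[0]                             # rescale so that w_1 = 1

print("lambda =", eigvals.real[idx])
print("w =", w)
```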
References
[1] R. A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Annals of Eugenics, vol. 7, no. 2, pp. 179–188, 1936, doi: 10.1111/j.1469-1809.1936.tb02137.x.
[2] A. M. Martinez and A. C. Kak, “PCA versus LDA,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228–233, Feb. 2001, doi: 10.1109/34.908974.