Principal components analysis (PCA) is a way to analyze the yield curve. It uses historical time series data, and the covariances implied by that data, to find factors that explain the variance in the term structure. Each successive factor is chosen so that the factors found so far cumulatively explain as much of the variance as possible.

Based on these factors, volatility functions are obtained to explain the underlying volatility of an interest rate model. For example, when implementing a Heath-Jarrow-Morton (HJM) interest rate model there are two main elements in the model construction process: one is the definition of volatility functions for the model, the other is a practical way of obtaining prices from the model.

In the example used to illustrate the PCA methodology below, we have based our analysis on daily quoted US treasury yield curve rates. If we were deriving volatility functions for the HJM construct, however, we would first need to derive spot rates, and subsequently forward rates, from these yield rates before conducting the PCA on the forward rates, because the HJM interest rate model takes the volatility function of forward rates as an input.
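As a minimal sketch of the last part of that conversion, the snippet below derives forward rates from spot rates. It assumes the spot (zero) rates have already been stripped from the quoted yields and are continuously compounded; the function name `forward_rates` and the sample maturities and rates are hypothetical and only serve the illustration. Under those assumptions, the forward rate between two maturities is the change in (spot rate × maturity) divided by the change in maturity.

```python
import numpy as np

def forward_rates(maturities, spot_rates):
    """Forward rates between consecutive maturities.

    Assumes continuously compounded spot (zero) rates, so that
    f(tau_1, tau_2) = (s_2 * tau_2 - s_1 * tau_1) / (tau_2 - tau_1).
    """
    tau = np.asarray(maturities, dtype=float)
    s = np.asarray(spot_rates, dtype=float)
    accumulated = s * tau  # equals -log(discount factor) at each maturity
    return np.diff(accumulated) / np.diff(tau)

# Hypothetical inputs: maturities in years and illustrative spot rates.
maturities = [1.0, 2.0, 3.0, 5.0]
spot = [0.02, 0.025, 0.03, 0.035]
fwd = forward_rates(maturities, spot)
print(fwd)  # one forward rate per interval between consecutive maturities
```

The stripping of spot rates from quoted par yields is a separate step that this sketch does not attempt.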

The objective of this exercise is to simply illustrate the PCA process which could be expanded further in the context of more complex interest rate modelling processes. As a result the components derived through the analysis carried out below explain the variation in quoted rates and not stripped rates (or forward rates) and the term structure of volatility is taken to be the volatility of the yield curve rates at different maturities.

The PCA process assumes that there are no jumps in the data. Rate data may “jump” due to, for example, the monetary policy setting of interest rates. It is necessary to remove jumps from the underlying data before conducting such an analysis. In our example below however we have made no adjustments for any jumps that may exist in the data.
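The text above does not prescribe how jumps should be removed. One simple possibility, shown here purely as an assumption, is to drop any day on which the change at some maturity is far larger than is typical for that maturity. The function name `drop_jump_days`, the threshold `n_std`, and the synthetic data are all hypothetical.

```python
import numpy as np

def drop_jump_days(changes, n_std=5.0):
    """Drop rows (days) where any maturity moves by more than n_std robust
    standard deviations. This filter is an illustrative assumption, not a
    method taken from the text."""
    changes = np.asarray(changes, dtype=float)
    center = np.median(changes, axis=0)
    # Median absolute deviation, scaled to be comparable with a standard
    # deviation for normally distributed data.
    scale = 1.4826 * np.median(np.abs(changes - center), axis=0)
    is_jump = np.any(np.abs(changes - center) > n_std * scale, axis=1)
    return changes[~is_jump], is_jump

# Synthetic daily changes for 3 maturities, with one artificial 25bp jump.
rng = np.random.default_rng(1)
changes = rng.normal(scale=1e-4, size=(200, 3))
changes[50] += 0.0025
filtered, is_jump = drop_jump_days(changes)
```

A robust scale estimate is used so that the jump itself does not inflate the threshold it is tested against.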

The steps used in the PCA are as follows:

- Obtain the historical time series rate data. At time t_{i} we observe the rates r(t_{i}, τ_{j}) for j = 1,…,k, where k is the number of maturities in, or the dimensionality of, the data.
- Calculate the differences d_{i,j} = r(t_{i+1}, τ_{j}) - r(t_{i}, τ_{j}).
- Calculate the covariances between the differences d_{i,j} for every pair of maturities and write them in the form of a covariance matrix, Σ.
- Find a matrix P such that PΣP’ is a diagonal matrix, where P’ is the transpose of the matrix P. Because P is orthogonal, P’ is also the inverse of P.
- The diagonal of the matrix PΣP’ is given by λ_{j} for j = 1,…,k, ordered such that λ_{1} ≥ … ≥ λ_{k} ≥ 0.

p_{i}, the i-th row of the matrix P, is an eigenvector of Σ and represents a principal component, or volatility component, of the term structure. λ_{i} is the eigenvalue associated with it. As there are k rows, there are k principal components to the term structure.
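The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a definitive implementation: the function name `pca_term_structure` is hypothetical, and the input is synthetic random-walk data standing in for real rate history.

```python
import numpy as np

def pca_term_structure(rates):
    """rates: array of shape (n_dates, k), one column per maturity."""
    rates = np.asarray(rates, dtype=float)
    # Differences d_{i,j} = r(t_{i+1}, tau_j) - r(t_i, tau_j)
    d = np.diff(rates, axis=0)
    # k x k covariance matrix of the differences (columns are variables)
    sigma = np.cov(d, rowvar=False)
    # Sigma is symmetric, so eigh applies; it returns eigenvalues in
    # ascending order and eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(sigma)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    lam = eigenvalues[order]
    P = eigenvectors[:, order].T           # rows of P are the p_i
    return lam, P, sigma

# Synthetic stand-in for 250 days of rates at 5 maturities (an assumption).
rng = np.random.default_rng(0)
rates = 0.03 + np.cumsum(rng.normal(scale=1e-4, size=(250, 5)), axis=0)
lam, P, sigma = pca_term_structure(rates)

# P @ sigma @ P.T should be diagonal, with lam on the diagonal.
print(np.allclose(P @ sigma @ P.T, np.diag(lam), atol=1e-14))
```

`numpy.linalg.eigh` is used rather than `eig` because a covariance matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.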

## Meaning of Eigenvectors and Eigenvalues

**Eigenvectors**, represented by the rows p_{i} of the matrix P, are the components explaining the volatility in a given term structure. Components usually have particular shapes. Plotted against maturity, a component shows the level of shift it produces at each maturity.

In many studies the first component, p_{1}, has been found to be fairly flat, i.e. the level of shift is the same across maturities. It is therefore representative of a parallel shift in the term structure.

The second component is found to be downward sloping, where the level of shift declines as maturities increase, which causes the term structure to tilt or slope.

The third component usually has a hump which accounts for the curvature in the term structure. There are other higher order components, but these first three components usually account for around 90%-95% of the variability in the term structure, so the other components are usually considered as noise and eliminated from further analysis of the components.

An **eigenvalue** assigns a level of relative importance to its associated eigenvector. The larger the eigenvalue is as a proportion of the sum of the eigenvalues of all the components, the greater the proportion of total variance explained by that particular eigenvector.
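The proportion of variance explained by each component can be computed directly from the eigenvalues. The sketch below uses made-up eigenvalues purely for illustration, and the function name `explained_variance` is hypothetical.

```python
import numpy as np

def explained_variance(eigenvalues):
    """Return each eigenvalue's share of total variance and the running total."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    proportion = lam / lam.sum()
    cumulative = np.cumsum(proportion)
    return proportion, cumulative

# Hypothetical eigenvalues (not taken from any real data set).
lam = [8.0, 1.5, 0.3, 0.15, 0.05]
proportion, cumulative = explained_variance(lam)

# Number of leading components needed to explain at least 90% of the variance.
n_components = int(np.searchsorted(cumulative, 0.90)) + 1
print(proportion, cumulative, n_components)
```

In practice the cumulative proportion is what guides how many components to keep and how many to discard as noise.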

In this post we have presented an overview of what PCA is and of the steps taken to carry out the analysis. We have also defined certain terminology used in the analysis, such as eigenvectors and eigenvalues. In the posts that follow we will look at how PCA may be carried out in EXCEL.
