13 – M4 L2b 15 PCA As A Factor Model Pt 1 V3

So, let’s return to the main subject of this lesson, which is factor models of risk. So, how do we use PCA to create a factor model of risk? What are our data? They are a set of time series of stock returns for many, many companies. Our main motivations for using PCA are to reduce the dimensionality of these data and also to find a representation of them that captures a maximum amount of their variance. We’ll use this representation, or model, of risk later when we seek to minimize risk as part of an optimization problem.

We’re going to be talking about a lot of matrix multiplications here, so I’m going to try to always write down the dimensions. I think keeping track of the dimensions will help you keep track of what’s going on. It certainly helps me. When we use a factor model, remember, we have a representation of the returns that looks like this: r = Bf + s. In order to produce a factor model of risk using PCA, we need to look at what the PCA algorithm gives us and map its outputs to each of these matrices: the factor exposures, the factor returns, and the idiosyncratic risk matrix. So, let’s look at what these symbols represent in terms of matrices. The returns matrix r has dimensions number of companies by number of time points. The matrix of factor exposures B has dimensions number of companies by number of factors. The matrix of factor returns f has dimensions number of factors by number of time points. Finally, the matrix of specific risk s has dimensions number of companies by number of time points.

Let’s take a look at what we get from running the PCA algorithm. Remember that PCA finds a new basis for the data. If we keep all the PCs, then multiplying the representation of the data in the PC basis by the matrix of PCs completely recreates the data in the original basis. On the other hand, if we drop some of the PCs, then multiplying the data matrix in the new basis by the matrix of PCs only approximately recreates the data. This is the compressed representation of the data. If we add in what’s left over, then we have a complete representation of the data. Now, this is looking a lot like the factor representation we discussed a little bit earlier. In fact, we’re just going to use this as our factor model.
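
To make the mapping concrete, here is a minimal sketch in Python using numpy and scikit-learn’s PCA. The data, the variable names (returns, B, f, s), and the choice of three factors are all hypothetical, chosen only to mirror the dimensions described above; note also that scikit-learn centers the data before fitting, so the reconstruction identity below holds for the de-meaned returns.

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data (hypothetical): returns for 10 companies over 500 time points.
# Rows are time points and columns are companies, the samples-x-features
# orientation that scikit-learn expects.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(500, 10))  # shape: (time, companies)

num_factors = 3  # keep only the top 3 principal components
pca = PCA(n_components=num_factors)
pca.fit(returns)

# Map PCA outputs onto the factor model r = B f + s, transposed back into
# the lecture's orientation:
B = pca.components_.T         # factor exposures: (companies, factors)
f = pca.transform(returns).T  # factor returns:   (factors, time)
r = (returns - pca.mean_).T   # de-meaned returns: (companies, time)
s = r - B @ f                 # specific (idiosyncratic) returns: (companies, time)

# Dropping PCs means B @ f only approximately recreates r;
# adding the leftover s back recovers it exactly.
assert np.allclose(B @ f + s, r)
```

As a quick check on the "maximum variance" motivation, pca.explained_variance_ratio_ reports the fraction of the total variance each retained component captures.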
