# 6 – M4 L2b 08 Writing It Down Pt 1 V3

Okay. So, what I told you is that we want to maximize the variance of the data in the new basis direction. Now I want to show you exactly how what I showed you in pictures translates into symbols that we write down. I don’t think this is as essential for you to know, because you already have the core idea. But I want to show this to you so that if you read about PCA in books or online, you’ll be able to follow what’s written, and because I myself like to go through these things step by step to make sure I really understand exactly what’s going on. I think this is important for being able to make inferences about the results of mathematical calculations, and for being able to compare techniques that do similar things. I want to give you the opportunity to understand it on a deep level too.

So, let’s get started. What we’re going to do here is go through what we already described in a bit more detail and a bit more rigorously. Let’s start with something I left out earlier. The very first thing we need to do when we do PCA is make sure the data are centered around zero. What do we mean by that?

Let’s say the dataset we started with was over here in the 2D plane. The coordinates of this point are a and b. Here’s the mean of all the points’ x coordinates, and here’s the mean of all the points’ y coordinates. When we begin PCA, if these means are non-zero, we want to subtract the mean in each dimension so that we get a new dataset centered around zero. First, we subtract the x mean from each data point’s x coordinate. Then we subtract the y mean from each data point’s y coordinate. Now the points are centered around zero. This is called mean centering or mean normalizing the data.
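To make the centering step concrete, here is a minimal sketch in Python. The function name `mean_center` and the three sample points are made up for illustration and are not from the lecture. It is just one straightforward way to subtract the mean in each dimension.

```python
from statistics import fmean


def mean_center(points):
    """Subtract the per-dimension mean from every 2D point.

    Returns the centered points and the (x_mean, y_mean) that was removed.
    """
    x_mean = fmean(x for x, _ in points)
    y_mean = fmean(y for _, y in points)
    centered = [(x - x_mean, y - y_mean) for x, y in points]
    return centered, (x_mean, y_mean)


# Hypothetical example data (assumption, chosen so the means are exact).
points = [(2.0, 1.0), (4.0, 3.0), (6.0, 8.0)]
centered, means = mean_center(points)

print(means)     # (4.0, 4.0)
print(centered)  # [(-2.0, -3.0), (0.0, -1.0), (2.0, 4.0)]
```

After centering, the x coordinates and the y coordinates each average to zero, which is the starting point the rest of PCA assumes.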