Let’s sidetrack a bit. Let’s say we’re at a party in a triangular room, and these black dots are people roaming around the party. Now let’s say we put some food in one corner, some dessert in another corner, and some music in the third. So people get drawn to these corners and start walking towards them. Some people like music, others food, others dessert. Some people, like this point on the left, are undecided between food and dessert, so they stay halfway between the two. But in general, people tend to walk towards the red areas and away from the blue ones.

Now let’s say we do the opposite. We put a lion in one corner, fire in another, and radioactive waste in the third. So now people will do the opposite: they’ll stay away from the corners and gravitate towards the center. They tend to go to the red area, which is now in the middle, and stay away from the blue areas.

So we have our three scenarios: when we put nice things in the corners, when we put nothing, and when we put dangerous things. These are examples of Dirichlet distributions. In these triangles, the dots are more likely to be in the red areas than in the blue ones. We’ll see more on this later, but a Dirichlet distribution has a parameter at each corner. If the parameters are small, say 0.7, 0.7, and 0.7, then we get the one on the left. If they’re all ones, then we get the one in the middle, and if they’re all big, say five, then we get the one on the right. We can think of the parameters as repelling factors. If they are large, they push the points away, and if they’re small, they pull the points closer to them.

So here’s a quiz. If the black dots are documents and the corners of the triangle are three topics (science, politics, and sports), which of these distributions would be the best one for picking our documents? This is a slightly vague question, but try to go with your intuition.
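If you want to play with the three party scenarios yourself, here is a minimal sketch in Python. It is not part of the original walkthrough, and it rests on a few assumptions: it uses only the standard library, it samples each point by normalizing three Gamma draws (a standard way to sample a Dirichlet distribution), and the helper names `sample_dirichlet` and `mean_largest_share` are made up for this illustration. Each sampled point has three coordinates that add up to one, and you can read each coordinate as how strongly a person is pulled to one corner.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one point in the triangle by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_share(alphas, n_samples=20000, seed=0):
    """Average of the largest coordinate over many sampled points.

    A value near 1 means points tend to sit close to some corner;
    a value near 1/3 means points tend to sit close to the center.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += max(sample_dirichlet(alphas, rng))
    return total / n_samples


# The three scenarios from the party analogy (parameter values taken from the text).
scenarios = {
    "small (0.7, 0.7, 0.7)": [0.7, 0.7, 0.7],
    "ones (1, 1, 1)": [1.0, 1.0, 1.0],
    "large (5, 5, 5)": [5.0, 5.0, 5.0],
}

results = {name: mean_largest_share(alphas) for name, alphas in scenarios.items()}

for name, value in results.items():
    print(f"{name}: average largest coordinate = {value:.3f}")
```

The average largest coordinate is just one rough way to summarize where the dots end up. It should be highest for the small parameters and lowest for the large ones, which matches the idea of the parameters acting as repelling factors.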