Well, the answer is, we need one arrow for each document-word pair. Since we have 500 documents and 1,000 words, the number of parameters is the number of documents times the number of words: 500 times 1,000, which is 500,000. That is too many parameters to estimate. Is there any way we can reduce this number and still keep most of the information? The answer is yes. We'll do this by adding to the model a small set of topics, or latent variables, that actually drive the generation of words in each document. So, in this model, any document is considered to have an underlying mixture of topics associated with it. Similarly, a topic is considered to be a mixture of the terms it is likely to generate. Here, we have, say, three topics: science, politics, and sports. So now, there are two sets of probability distributions that we need to compute: the probability of a topic z given a document d, or P(z|d), and the probability of a term t given a topic z, or P(t|z). Our new probability of a term given a document can be expressed as a sum over the two previous probabilities: P(t|d) = sum over z of P(t|z) P(z|d). We'll see more about this later. So, it's time for another quiz. If we have 500 documents, 10 topics, and 1,000 words, how many parameters does this new model have? Enter your answer below.
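To make the mixture concrete, here is a minimal sketch of the decomposition P(t|d) = sum over z of P(t|z) P(z|d) for a single document. The three topics match the lecture, but the tiny four-word vocabulary and all probability values are made up for illustration.

```python
# P(z|d): topic mixture for one document d (illustrative numbers)
p_topic_given_doc = {"science": 0.7, "politics": 0.2, "sports": 0.1}

# P(t|z): each topic is a distribution over a toy 4-word vocabulary
p_term_given_topic = {
    "science":  {"atom": 0.5, "vote": 0.1, "ball": 0.1, "data": 0.3},
    "politics": {"atom": 0.1, "vote": 0.6, "ball": 0.1, "data": 0.2},
    "sports":   {"atom": 0.1, "vote": 0.1, "ball": 0.7, "data": 0.1},
}

def p_term_given_doc(term):
    # P(t|d) = sum over topics z of P(t|z) * P(z|d)
    return sum(p_term_given_topic[z][term] * p_topic_given_doc[z]
               for z in p_topic_given_doc)

# The mixture still yields a valid distribution over terms:
total = sum(p_term_given_doc(t) for t in ["atom", "vote", "ball", "data"])
print(round(total, 10))  # 1.0
```

Note how the parameter count scales: instead of one value per document-word pair, the model stores one P(z|d) table of size (documents x topics) and one P(t|z) table of size (topics x words), which is what the quiz asks you to count.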