A quick note for my own reference.

The definition on Wikipedia: http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

The key idea: "it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics."

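In other words, a document is treated as a mixture of topics, and each word in it has some probability of being assigned to each of those topics.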
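The generative story behind this can be sketched in a few lines of Python. This is only a minimal illustration, assuming NumPy is available; the function name `generate_document`, the toy sizes (3 topics, 8 vocabulary words, 20 words per document) and the hyperparameter values are all made up for the example and do not come from the Wikipedia article.

```python
import numpy as np

def generate_document(n_words, alpha, topic_word, rng):
    """Sample one document from an LDA-style generative process (toy sketch)."""
    n_topics, vocab_size = topic_word.shape
    # Per-document topic mixture: theta ~ Dirichlet(alpha)
    theta = rng.dirichlet(alpha)
    # For each word position, first pick one topic from the document's mixture
    topics = rng.choice(n_topics, size=n_words, p=theta)
    # Then draw the word itself from the chosen topic's word distribution
    words = np.array([rng.choice(vocab_size, p=topic_word[z]) for z in topics])
    return theta, topics, words

rng = np.random.default_rng(0)
# Assumed toy setup: 3 topics over a vocabulary of 8 word ids
alpha = np.full(3, 0.2)  # alpha < 1 tends to concentrate a document on few topics
topic_word = rng.dirichlet(np.full(8, 0.5), size=3)  # one word distribution per topic
theta, topics, words = generate_document(20, alpha, topic_word, rng)
print("topic mixture:", np.round(theta, 3))
print("topic per word:", topics)
print("word ids:", words)
```

Each entry of `topics` is the single topic that the corresponding word is attributed to, which matches the "one of the document's topics" wording in the quote above.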

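Relation to probabilistic latent semantic analysis (PLSA): LDA can be viewed as a Bayesian version of PLSA, that is, PLSA with a Dirichlet prior placed on each document's topic distribution.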
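To make the comparison concrete, here is how the two models are commonly written. The symbols ($d$ for a document, $z$ for a topic, $w$ for a word, $\theta_d$ for the topic mixture of document $d$, $\alpha$ for the Dirichlet hyperparameter) are my own notation for this note, not taken from the text above.

```latex
% PLSA: the topic mixture p(z | d) is a free parameter fitted for each document
p(w \mid d) = \sum_{z} p(w \mid z)\, p(z \mid d)

% LDA: the topic mixture is itself a random variable with a Dirichlet prior
\theta_d \sim \mathrm{Dirichlet}(\alpha), \qquad
p(w \mid \theta_d) = \sum_{z} p(w \mid z)\, \theta_{d,z}
```

The word-level mixture has the same form in both; the difference is that LDA draws $\theta_d$ from a prior instead of fitting it separately for every document.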