Question

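I built a topic model with some initial seed words using the GuidedLDA package (https://github.com/vi3k6i5/GuidedLDA). The topics look good, but now I would like to see the word frequency distribution and a word cloud for each topic. How can I do that?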

This is how I currently access the top 10 words in each topic:

>>> import numpy as np
>>> n_top_words = 10
>>> topic_word = model.topic_word_
>>> for i, topic_dist in enumerate(topic_word):
...     topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words+1):-1]
...     print('Topic {}: {}'.format(i, ' '.join(topic_words)))
...
Topic 0: game play team win season player second point start victory
Topic 1: company percent market price business sell executive pay plan sale
Topic 2: play life man music place write turn woman old book
Topic 3: official government state political leader states issue case member country
Topic 4: school child city program problem student state study family group

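However, how do I determine how many times each word occurs in a topic, and generate a word cloud from that? I am not sure whether the model keeps track of word frequencies at all.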
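One possible direction, sketched under assumptions rather than confirmed against GuidedLDA itself: `topic_word_` appears to hold normalized per-topic word weights, not raw counts, and a word cloud only needs relative weights, so those values could stand in for frequencies. The helper name `topic_word_weights` and the toy data below are hypothetical and only illustrate the shape of the result.

```python
def topic_word_weights(topic_word, vocab, n_top_words=10):
    """Return one {word: weight} dict per topic, keeping the heaviest words.

    Hypothetical helper: topic_word is assumed to be shaped like
    model.topic_word_ (one row of weights per topic, one column per word).
    """
    weights_per_topic = []
    for topic_dist in topic_word:
        # Pair each vocabulary word with its weight in this topic.
        pairs = zip(vocab, (float(w) for w in topic_dist))
        # Keep the n_top_words pairs with the largest weight.
        top = sorted(pairs, key=lambda p: p[1], reverse=True)[:n_top_words]
        weights_per_topic.append(dict(top))
    return weights_per_topic


# Toy stand-ins for model.topic_word_ and vocab (made-up numbers).
toy_vocab = ["game", "team", "market", "price"]
toy_topic_word = [
    [0.5, 0.3, 0.1, 0.1],
    [0.05, 0.05, 0.5, 0.4],
]

weights = topic_word_weights(toy_topic_word, toy_vocab, n_top_words=2)
for i, w in enumerate(weights):
    print("Topic {}: {}".format(i, w))
```

With a fitted model, the call would presumably be `topic_word_weights(model.topic_word_, vocab)`. Each resulting dict is the kind of word-to-weight mapping that `WordCloud().generate_from_frequencies(...)` from the separate `wordcloud` package accepts, assuming that package is installed. If raw assignment counts are wanted instead of weights, the `lda` package that GuidedLDA is derived from exposes an `nzw_` matrix; whether GuidedLDA keeps that attribute is an assumption worth checking.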

Thanks.

How to make a word cloud from GuidedLDA output

0 answers: