How to make a word cloud from GuidedLDA output

Time: 2019-02-14 04:41:58

Tags: python python-3.x lda topic-modeling

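I used the GuidedLDA package (https://github.com/vi3k6i5/GuidedLDA) to build a topic model with some initial seeds. The results look good. Now I would like to see the word frequency distribution and a word cloud for each topic. How can I do that?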

This is how I access the top 10 words in each topic:

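>>> import numpy as np
>>> n_top_words = 10
>>> topic_word = model.topic_word_  # one row of word weights per topic
>>> for i, topic_dist in enumerate(topic_word):
...     # sort ascending by weight, then take the last n_top_words in reverse order
...     topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words+1):-1]
...     print('Topic {}: {}'.format(i, ' '.join(topic_words)))
...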
Topic 0: game play team win season player second point start victory
Topic 1: company percent market price business sell executive pay plan sale
Topic 2: play life man music place write turn woman old book
Topic 3: official government state political leader states issue case member country
Topic 4: school child city program problem student state study family group

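But how do I determine how many times each word occurs in a topic, and generate a word cloud from that? I am not sure whether this model captures word occurrence counts at all.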
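
A minimal sketch of one possible direction, not a confirmed solution: each row of `model.topic_word_` already holds a weight per vocabulary word, so a word-to-weight dictionary for a topic can be built from it and passed to `WordCloud.generate_from_frequencies` from the third-party `wordcloud` package. The helper names `top_word_weights` and `draw_topic_cloud` and the toy arrays are hypothetical stand-ins for the real `model.topic_word_` and `vocab`. These weights are probabilities, not raw counts. The `lda` package that GuidedLDA is forked from documents an `nzw_` attribute with topic-word assignment counts; whether GuidedLDA exposes it the same way is an assumption that would need checking.

```python
import numpy as np


def top_word_weights(topic_word, vocab, topic_idx, n_top_words=10):
    """Return {word: weight} for the highest-weighted words of one topic."""
    topic_dist = np.asarray(topic_word)[topic_idx]
    # Indices of the largest weights, highest first.
    top_idx = np.argsort(topic_dist)[::-1][:n_top_words]
    return {vocab[j]: float(topic_dist[j]) for j in top_idx}


def draw_topic_cloud(freqs, path):
    """Render a word cloud image from a {word: weight} dict."""
    # Imported lazily: needs the third-party `wordcloud` package installed.
    from wordcloud import WordCloud
    cloud = WordCloud(background_color="white").generate_from_frequencies(freqs)
    cloud.to_file(path)
    return cloud


# Toy stand-ins (hypothetical) for model.topic_word_ and vocab.
toy_vocab = ["game", "team", "market", "price"]
toy_topic_word = np.array([[0.5, 0.3, 0.1, 0.1],
                           [0.05, 0.05, 0.5, 0.4]])

freqs = top_word_weights(toy_topic_word, toy_vocab, topic_idx=0, n_top_words=2)
print(freqs)
```

With a fitted model, the same call would presumably look like `top_word_weights(model.topic_word_, vocab, i)` for topic `i`, followed by `draw_topic_cloud(freqs, 'topic_{}.png'.format(i))`. The same dictionary could also feed a bar chart if a frequency-style plot per topic is wanted.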

Thanks.

0 answers:

No answers yet