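I am using the WordCloud package to display the words generated by scikit-learn's LDA (Latent Dirichlet Allocation). For each topic LDA produces, I get one chart. I would like to draw all the charts in a grid so they can be shown side by side. Basically, I have a function that takes an LDA model and the number of the topic I want to visualize, and then draws a word cloud: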
from wordcloud import WordCloud
import matplotlib.pyplot as plt
SEED=0
def topicWordCloud(model, topicNumber, WCmaxWords,WCwidth, WCheight):
    topic = model.components_[topicNumber]
    tupleList = [(tf_feature_names[i],int(topic[i]/topic.sum()*10000)) for i in range(len(topic))]
    wordcloud = WordCloud(width=WCwidth, height=WCheight, max_words=WCmaxWords, random_state=42).generate_from_frequencies(tupleList)
    plt.figure( figsize=(20,10) )
    plt.imshow(wordcloud)
    plt.axis("off")
topicWordCloud(model=lda, topicNumber=2, WCmaxWords=100,WCwidth=800, WCheight=600)
How can I loop over all my topics (n_topics) to draw every chart in a grid? I was thinking of something along these lines:
fig = plt.figure()
for i in range(n_topics):
    plt.subplot(2,1,i+1)
    #something here
Answer 0 (score: 4)
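Return the wordcloud from your function instead of plotting it there, then call topicWordCloud from inside the for loop. Next, create an Axes for each topic with fig.add_subplot and call imshow on that Axes. For example, something like this: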
def topicWordCloud(model, topicNumber, WCmaxWords,WCwidth, WCheight):
    topic = model.components_[topicNumber]
    # Scale each word's share of the topic weight to an integer frequency
    tupleList = [(tf_feature_names[i],int(topic[i]/topic.sum()*10000)) for i in range(len(topic))]
    wordcloud = WordCloud(width=WCwidth, height=WCheight, max_words=WCmaxWords, random_state=42).generate_from_frequencies(tupleList)
    # Return the word cloud so the caller decides where to draw it
    return wordcloud
fig = plt.figure()
for i in range(n_topics):
    # A 2x1 grid only has room for two topics; change the rows/columns to fit n_topics
    ax = fig.add_subplot(2,1,i+1)
    wordcloud = topicWordCloud(...)
    ax.imshow(wordcloud)
    ax.axis('off')
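If n_topics is larger than two, the fixed 2x1 grid above runs out of cells. The following is a minimal sketch of one way to size the grid from the number of topics. The helper names grid_shape, topic_frequencies and plot_topic_grid, the n_cols parameter and the figure size are all assumptions made for illustration and are not part of the original code. It also passes a dict to generate_from_frequencies, since recent wordcloud releases appear to expect a word-to-frequency mapping rather than a list of tuples; check this against the version you have installed.

```python
import math


def grid_shape(n_plots, n_cols=2):
    """Return (rows, cols) for a grid with enough cells for n_plots."""
    n_cols = max(1, min(n_cols, n_plots))
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols


def topic_frequencies(topic_weights, feature_names):
    """Map each word to its share of the topic's total weight."""
    total = float(sum(topic_weights))
    return {word: weight / total
            for word, weight in zip(feature_names, topic_weights)}


def plot_topic_grid(model, feature_names, n_cols=2, max_words=100):
    """Draw one word cloud per topic of a fitted LDA model in a grid."""
    # Imported here so the two helpers above work without the plotting stack.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    n_topics = len(model.components_)
    n_rows, n_cols = grid_shape(n_topics, n_cols)
    # Figure size is an arbitrary choice: roughly 6x4 inches per cell.
    fig = plt.figure(figsize=(6 * n_cols, 4 * n_rows))
    for i, topic in enumerate(model.components_):
        ax = fig.add_subplot(n_rows, n_cols, i + 1)
        cloud = WordCloud(max_words=max_words, random_state=42)
        cloud.generate_from_frequencies(topic_frequencies(topic, feature_names))
        ax.imshow(cloud)
        ax.axis("off")
        ax.set_title("Topic {}".format(i))
    return fig
```

With the names from the question, this could be called as fig = plot_topic_grid(lda, tf_feature_names, n_cols=3) followed by plt.show(). Computing the row count from the topic count means the same loop works whether the model has two topics or twenty.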