WordCloud from a data frame with frequencies (Python)

Time: 2016-07-19 18:00:17

Tags: python word-cloud

My data frame looks like this:

Int64Index: 14830 entries, 25791 to 10668
Data columns (total 2 columns):
word    14830 non-null object
coef    14830 non-null float64
dtypes: float64(1), object(1)

I am trying to make a word cloud that uses coef as the frequency instead of a count. For example, I tried:

text = df['word']
WordCloud.generate_from_text(text)
TypeError: generate_from_text() missing 1 required positional argument: 'text'

text = np.array(df['word'])
WordCloud.generate_from_text(text)
TypeError: generate_from_text() missing 1 required positional argument: 'text'

How can I improve this code and make a word cloud along these lines?

from wordcloud import WordCloud
wordcloud = WordCloud( ranks_only= frequency).generate(text)
plt.imshow(wordcloud)
plt.axis('off')
plt.show()

Thanks

2 answers:

Answer 0 (score: 7)

For me, it worked to create a dictionary like this:

d = {}
for a, x in bag.values:
    d[a] = x

import matplotlib.pyplot as plt
from wordcloud import WordCloud

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(frequencies=d)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

where bag is a pandas DataFrame with two columns: the words and their counts.
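
Applied to the question's data frame, the same idea might look like the sketch below. Note that generate_from_frequencies is called on a WordCloud() instance; calling a method on the class itself, as in the question, is what raises the "missing 1 required positional argument" error. The helper names coef_frequencies and draw_cloud are hypothetical, the sample values are made up, and dropping non-positive coefficients is my own assumption (font sizes are scaled from the weights), so adjust that to whatever coef means in your data.

```python
def coef_frequencies(pairs):
    """Build a {word: weight} dict from (word, coef) pairs.

    Assumption: only positive coefficients are kept, because the weights
    are used to scale font sizes.
    """
    freqs = {}
    for word, coef in pairs:
        if coef > 0:
            freqs[str(word)] = float(coef)
    return freqs


def draw_cloud(freqs):
    # Imported here so coef_frequencies works without these packages installed.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    # Create an instance first, then call the method on it.
    cloud = WordCloud().generate_from_frequencies(freqs)
    plt.figure()
    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()


# Stand-in for zip(df["word"], df["coef"]); values are invented for illustration.
sample = [("good", 2.5), ("bad", -1.0), ("fine", 0.5)]
freqs = coef_frequencies(sample)
# draw_cloud(freqs)  # uncomment to render; needs wordcloud and matplotlib
```

With the real data, `coef_frequencies(zip(df["word"], df["coef"]))` would replace the sample list.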

Answer 1 (score: 0)

First we get a list of tuples

Then

tuples = [tuple(x) for x in df.values]

That's all
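
As far as I know, recent wordcloud releases expect generate_from_frequencies to receive a dict rather than a list of tuples, so the list may need one more conversion step. A minimal sketch, with made-up rows standing in for df.values:

```python
# Hypothetical (word, weight) rows standing in for df.values.
rows = [["python", 3.0], ["cloud", 1.5]]

# Same list-of-tuples step as in the answer.
tuples = [tuple(x) for x in rows]

# Convert to a {word: weight} mapping.
frequencies = dict(tuples)

# frequencies can then be passed to
# WordCloud().generate_from_frequencies(frequencies)
```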