Wordcloud: larger font for lower tf-idf values

Time: 2017-10-31 22:11:24

Tags: python pandas matplotlib tf-idf word-cloud

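I am trying to build a word cloud from tf-idf values.

Below are the tf-idf values from the DataFrame. When I build the word cloud, the word with the highest value, here "seat - 2.57", is drawn in the largest font. I need the opposite: "nice - 2.088" should get the largest font, because it is the more important word.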

[[u'nice' 2.0886619578149417]
 [u'owl' 2.2729656758128876]
 [u'person' 2.386294361119891]
 [u'read' 2.455287232606842]
 [u'seat' 2.5766480896111092]]

Here is the code:

print(top_10.values)
d = {}
for a, x in top_10.values:
    d[a] = x

import matplotlib.pyplot as plt
from wordcloud import WordCloud

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(frequencies=d)
plt.figure( figsize=(20,10) )
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

1 Answer:

Answer 0 (score: 3)

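You can pass wordcloud a different set of frequencies, transformed so that a lower tf-idf value produces a larger weight.

For example, either of: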

d[a] = 3-x

d[a] = 1./x
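The constant 3 in the first variant only works because every value in this data set is below 3; a larger tf-idf value would produce a zero or negative weight. A minimal sketch of a data-independent alternative, assuming all scores are positive, mirrors each score around the midpoint of the range. The helper name `invert_weights` is hypothetical and not part of wordcloud:

```python
def invert_weights(scores):
    """Map each score x to (max + min - x).

    The smallest score receives the largest weight and vice versa.
    Assumes `scores` is a non-empty dict of positive numbers, so every
    resulting weight stays positive.
    """
    lo = min(scores.values())
    hi = max(scores.values())
    return {word: hi + lo - x for word, x in scores.items()}


# tf-idf values taken from the question
scores = {
    "nice": 2.0886619578149417,
    "owl": 2.2729656758128876,
    "person": 2.386294361119891,
    "read": 2.455287232606842,
    "seat": 2.5766480896111092,
}

weights = invert_weights(scores)
for word, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(word, round(weight, 3))
```

The resulting `weights` dict can be passed to `generate_from_frequencies` in place of `d`.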

Complete example:

# Stand-in for the asker's DataFrame: any object that accepts a .values attribute will do
top_10 = lambda: ""
top_10.values = [[u'nice', 2.0886619578149417],
                 [u'owl', 2.2729656758128876],
                 [u'person', 2.386294361119891],
                 [u'read', 2.455287232606842],
                 [u'seat', 2.5766480896111092]]
d = {}
for a, x in top_10.values:
    d[a] = 3-x

import matplotlib.pyplot as plt
from wordcloud import WordCloud

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(frequencies=d)
plt.figure( figsize=(5,3) )
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
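The two suggested transforms do not give the same visual contrast. The sketch below compares them by scaling each set of weights relative to its largest entry. This assumes that wordcloud sizes words according to their frequency relative to the most frequent word, which is my understanding of `generate_from_frequencies` rather than something stated in the thread; the exact font sizes also depend on settings such as `relative_scaling`. The helper `relative` is hypothetical:

```python
values = [
    ["nice", 2.0886619578149417],
    ["owl", 2.2729656758128876],
    ["person", 2.386294361119891],
    ["read", 2.455287232606842],
    ["seat", 2.5766480896111092],
]


def relative(weights):
    """Scale weights so that the largest one becomes 1.0."""
    top = max(weights.values())
    return {word: w / top for word, w in weights.items()}


# The two transforms from the answer, each scaled to its own maximum
linear = relative({word: 3 - x for word, x in values})
reciprocal = relative({word: 1.0 / x for word, x in values})

for word, _ in values:
    print(f"{word:8s} 3-x: {linear[word]:.2f}   1/x: {reciprocal[word]:.2f}")
```

For this data, `3-x` spreads the relative weights further apart than `1/x`, so it is the one to prefer if the difference between "nice" and "seat" should be clearly visible.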
