Question

I'm trying to create a word cloud from a CSV file. The file has about 1800 rows: the first column holds string values (names) and the second column holds their respective frequencies (int). The file is read and each key-value row is stored in a dictionary (d), because we will use it later to draw the word cloud:

reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = v

Once the dictionary is full of values, I try to plot the word cloud:

#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

But an error is thrown:

Traceback (most recent call last):
  File ".........../script.py", line 19, in <module>
    wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
  File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in generate_from_frequencies
    for word, freq in frequencies]
  File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in <listcomp>
    for word, freq in frequencies]
TypeError: unsupported operand type(s) for /: 'str' and 'float'

Finally, the documentation says:

def generate_from_frequencies(self, frequencies, max_font_size=None):
    """Create a word_cloud from words and frequencies.
    Parameters
    ----------
    frequencies : dict from string to float
        A contains words and associated frequency.
    max_font_size : int
        Use this font-size instead of self.max_font_size
    Returns
    -------
    self

So, if I meet the function's requirements, I don't understand why it gives me this error. I hope someone can help me, thanks.

Note

I'm using wordcloud 1.3.1

Answer 1

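This is because the values in your dictionary are strings, but wordcloud expects integers or floats.

After running your code and inspecting the dictionary d, I get the following.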

In [12]: d

Out[12]: {'a': '1', 'b': '2', 'c': '4', 'j': '20'}

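Note the ' ' around the numbers, which shows that these values are actually strings.

A hacky way to fix this is to cast v to an int inside your for loop, like: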

d[k] = int(v)

I say this is hacky because it works for integers, but if your input contains floats it will cause problems: int('2.5') raises a ValueError.
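If the frequencies might be decimals, a slightly more tolerant conversion could look like the sketch below. It assumes every valid row has exactly two columns; the helper name `load_frequencies` is hypothetical (it is not part of wordcloud or the csv module), and the choice to skip rows whose second column is not numeric, such as a header line, is an assumption that may not suit every file.

```python
import csv
import io


def load_frequencies(lines):
    """Build a {word: float} dict from an iterable of CSV lines (hypothetical helper)."""
    frequencies = {}
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # assumption: ignore rows that do not have exactly two columns
        word, raw = row
        try:
            frequencies[word] = float(raw)  # float() accepts both "3" and "2.5"
        except ValueError:
            continue  # assumption: skip rows whose value is not numeric (e.g. a header)
    return frequencies


# Made-up sample data standing in for the real file.
sample = io.StringIO("name,count\na,1\nb,2.5\nc,4\n")
freqs = load_frequencies(sample)
print(freqs)
```

With a real file, the same helper can be given an open file object instead of the `io.StringIO` sample, and the resulting dictionary passed to generate_from_frequencies.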

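Also, Python errors can be hard to read. Your error above can be interpreted as

script.py, line 19

TypeError: unsupported operand type(s) for /: 'str' and 'float'

"I have a type error on or before line 19 of my file. Let me look at my data types to see whether there is a mismatch between a string and a float..."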
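One quick way to do that kind of check, as a minimal sketch: list every entry whose value is not a number before handing the dictionary to generate_from_frequencies. The helper name `non_numeric_items` is made up for illustration, and the sample dictionary is invented.

```python
def non_numeric_items(frequencies):
    """Return the (key, value) pairs whose value is neither an int nor a float."""
    return [(key, value) for key, value in frequencies.items()
            if not isinstance(value, (int, float))]


# Made-up dictionary mixing a string value with proper numbers.
d = {'a': '1', 'b': 2, 'c': 4.0}
bad = non_numeric_items(d)
print(bad)  # [('a', '1')]
```

An empty list means every value is already numeric; anything else points at the entries that still need converting.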

The following code works for me:

import csv
from wordcloud import WordCloud
import matplotlib.pyplot as plt

reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = int(v)

#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)

plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

Answer 2

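import wordcloud

# Assumption: the original snippet is a function body (it ends with `return`);
# the wrapper name and the uninteresting_words parameter are added for illustration.
def calculate_frequencies(file_contents, uninteresting_words=()):
    # Keep only letters and whitespace.
    file_c = ""
    for char in file_contents:
        if char.isalpha() or char.isspace():
            file_c += char
    # Split into words and drop the uninteresting ones (case-insensitive).
    file_w = []
    for word in file_c.split():
        if word.lower() not in uninteresting_words:
            file_w.append(word)
    # Count occurrences as integers, so the dictionary values are numeric.
    frequency = {}
    for word in file_w:
        key = word.lower()
        frequency[key] = frequency.get(key, 0) + 1
    # Build the word cloud from the integer counts.
    cloud = wordcloud.WordCloud()
    cloud.generate_from_frequencies(frequency)
    return cloud.to_array()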
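The counting step can also be written with collections.Counter from the standard library. This is only a sketch of an alternative, not the answerer's method: the function name `count_words` and its `uninteresting_words` parameter are hypothetical, and the sample text is invented.

```python
from collections import Counter


def count_words(text, uninteresting_words=frozenset()):
    """Count lowercase alphabetic words, ignoring any in uninteresting_words."""
    # Keep only letters and whitespace, so punctuation does not stick to words.
    cleaned = "".join(ch for ch in text if ch.isalpha() or ch.isspace())
    words = (word.lower() for word in cleaned.split())
    return Counter(word for word in words if word not in uninteresting_words)


# Made-up sample text and stop words.
counts = count_words("The cat, the hat; a CAT!", {"the", "a"})
print(counts)
```

Because Counter is a dict subclass with integer values, the result should be usable wherever a plain frequency dictionary is expected, including generate_from_frequencies.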

Wordcloud Python with generate_from_frequencies
