This code works, but I'm wondering whether there is a more Pythonic way to write it.
word_frequency is a dictionary of lists, for example:
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0000], 'fun': [4389, 3234]}
vocab_frequency = [0, 0] # stores the total times all the words used in each class
for word in word_frequency: # that is not the most elegant solution, but it works!
    vocab_frequency[0] += word_frequency[word][0] # negative class
    vocab_frequency[1] += word_frequency[word][1] # positive class
Is there a more elegant way to write this loop?
Answer 0 (score: 8)
I'm not sure whether this is more Pythonic:
>>> word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0000], 'fun': [4389, 3234]}
>>> vocab_frequency = [sum(x[0] for x in word_frequency.values()),
...                    sum(x[1] for x in word_frequency.values())]
>>> print(vocab_frequency)
[15622, 7555]
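An alternative solution with reduce (in Python 3 it must be imported from functools):
>>> from functools import reduce
>>> reduce(lambda x, y: [x[0] + y[0], x[1] + y[1]], word_frequency.values())
[15622, 7555]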
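One caveat with the reduce approach: without an initial value, reduce raises a TypeError when the dictionary is empty. A minimal sketch of a variant that passes a starting accumulator, assuming the two-class layout from the question (the helper name total_per_class is hypothetical, chosen only for illustration):

```python
from functools import reduce

def total_per_class(word_frequency):
    """Sum the per-class counts; the function name is hypothetical."""
    # The initial value [0, 0] keeps reduce from raising TypeError on an empty dict.
    return reduce(
        lambda acc, counts: [acc[0] + counts[0], acc[1] + counts[1]],
        word_frequency.values(),
        [0, 0],
    )

word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0], 'fun': [4389, 3234]}
print(total_per_class(word_frequency))  # [15622, 7555]
print(total_per_class({}))              # [0, 0]
```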
Answer 1 (score: 4)
You can use numpy for this:
import numpy as np
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0000], 'fun': [4389, 3234]}
vocab_frequency = np.sum(list(word_frequency.values()), axis=0)
Answer 2 (score: 2)
list(map(sum, zip(*word_frequency.values())))
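The one-liner is compact, so here is a step-by-step sketch of what it appears to do, using the sample data from the question. zip(*...) regroups the per-word lists into one tuple per class, and map(sum, ...) then totals each tuple. The intermediate name per_class is introduced here only for illustration.

```python
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0], 'fun': [4389, 3234]}

# Unpacking with * passes each per-word list to zip as a separate argument;
# zip then regroups the values by position, i.e. by class.
per_class = list(zip(*word_frequency.values()))
print(per_class)  # [(1234, 9999, 4389), (4321, 0, 3234)]

# Sum each per-class tuple to get the totals.
vocab_frequency = list(map(sum, per_class))
print(vocab_frequency)  # [15622, 7555]
```

Note that for an empty dictionary this yields an empty list rather than [0, 0], which may or may not matter for your use case.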
Answer 3 (score: 2)
Perhaps not the shortest way to solve this, but hopefully the easiest to understand...
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0000], 'fun': [4389, 3234]}
negative = (v[0] for v in word_frequency.values())
positive = (v[1] for v in word_frequency.values())
vocab_frequency = sum(negative), sum(positive)
print(vocab_frequency) # (15622, 7555)
Experienced Pythonistas, though, might prefer to unpack the values with zip:
negative, positive = zip(*word_frequency.values())
vocab_frequency = sum(negative), sum(positive)
Answer 4 (score: 1)
Another approach:
vocab_frequency[0], vocab_frequency[1] = list(sum([word_frequency[elem][i] for elem in word_frequency]) for i in range(2))
print(vocab_frequency[0])
print(vocab_frequency[1])
Output:
15622
7555
There is also another way to do it, although it is a bit far-fetched:
*vocab_frequency, = list(map(sum,zip(*word_frequency.values())))
print(vocab_frequency)
Output:
[15622, 7555]
Answer 5 (score: 1)
for frequencies in word_frequency.values():
    vocab_frequency = [sum(x) for x in zip(vocab_frequency, frequencies)]
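This answer relies on vocab_frequency already being initialized to [0, 0], as in the question. A self-contained sketch of the same idea, assuming that starting value, might look like this:

```python
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0], 'fun': [4389, 3234]}

vocab_frequency = [0, 0]  # one running total per class, starting at zero
for frequencies in word_frequency.values():
    # zip pairs each running total with the matching count for this word.
    vocab_frequency = [total + count
                       for total, count in zip(vocab_frequency, frequencies)]

print(vocab_frequency)  # [15622, 7555]
```

Because zip stops at the shorter input, the length of the initial list determines how many classes are summed.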
Answer 6 (score: 1)
You can convert the dictionary to a pandas DataFrame, which is easier to work with.
import pandas as pd
word_frequency = {'dogs': [1234, 4321], 'are': [9999, 0000], 'fun': [4389, 3234]}
# Create the DataFrame
df = pd.DataFrame(word_frequency)
# Result:
#    dogs   are   fun
# 0  1234  9999  4389
# 1  4321     0  3234
Now just take the sum of each row, then either convert the result back to a list or keep it as a pandas object.
#Take sum of each row and convert to list
df = df.sum(axis=1)
df = df.values.tolist()
print(df)
#Output
[15622, 7555]
Answer 7 (score: 1)
Try this one-line solution:
[sum([word_frequency[i][0] for i in word_frequency]),sum([word_frequency[i][1] for i in word_frequency])]
Answer 8 (score: 0)
res = [0, 0]
for n, p in your_dict.values():
    res[0] += n
    res[1] += p
This is fast and elegant enough. Sent from my phone; sorry about the formatting.