Question

I have loaded an Excel file that contains text. I want to count how many times each word occurs, for example:

Desired output:

was 2
report 1
county 5
increase 2

Code:

 import pandas as pd

 news = pd.read_excel('C:\\Users\\farid-PC\\Desktop\\Tester.xlsx')
 pd.set_option('display.max_colwidth', 1000)
 print(news)
 # implement word counter?

Current output:

   Text
0  Trump will drop a bomb on North Korea
1  Building a wall on the U.S.-Mexico border will take literally years
2  Wisconsin is on pace to double the number of layoffs this year.
3  Says John McCain has done nothing to help the vets.
4  Suzanne Bonamici supports a plan that will cut choice for Medicare

Any help would be appreciated.

Answer 1

For pandas, use str.split, stack, and value_counts. Here df is the DataFrame loaded in the question (named news there):

series = df.Text.str.split(expand=True).stack().value_counts()

A Python-based alternative uses chain.from_iterable (for flattening) and Counter (for counting):

from collections import Counter
from itertools import chain

counter = Counter(chain.from_iterable(map(str.split, df.Text.tolist())))

Recreate a series of counts using:

series = pd.Series(counter).sort_values(ascending=False)

This gives the same counts as the pandas solution above and should be faster, since no stacking is involved (stack is a slow operation).
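To check that the two approaches agree, here is a minimal sketch that runs both on the five rows shown in the question. The DataFrame is built by hand as a stand-in for the Excel file, and the names pandas_counts and counter_counts are only illustrative. Splitting is on whitespace only, so a token such as "year." keeps its trailing period and is counted separately from "year".

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Stand-in for the Excel data: the five rows printed in the question.
df = pd.DataFrame({"Text": [
    "Trump will drop a bomb on North Korea",
    "Building a wall on the U.S.-Mexico border will take literally years",
    "Wisconsin is on pace to double the number of layoffs this year.",
    "Says John McCain has done nothing to help the vets.",
    "Suzanne Bonamici supports a plan that will cut choice for Medicare",
]})

# pandas approach: one column per token, stacked into a single Series, then counted.
pandas_counts = df.Text.str.split(expand=True).stack().value_counts()

# Pure-Python approach: split each row, flatten the token lists, count with Counter.
counter = Counter(chain.from_iterable(map(str.split, df.Text.tolist())))
counter_counts = pd.Series(counter).sort_values(ascending=False)

# Show the most frequent tokens from the Counter-based result.
print(counter_counts.head())
```

If case or punctuation should not matter, the text could be normalized first (for example with df.Text.str.lower()) before either approach is applied; whether that is wanted depends on the data.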
