I loaded an Excel file that contains text. I want to count how many times each word occurs, for example:
Desired output:
was 2
report 1
county 5
increase 2
Code:
import pandas as pd
news = pd.read_excel('C:\\Users\\farid-PC\\Desktop\\Tester.xlsx')
pd.set_option('display.max_colwidth', 1000)
print(news)
# implement word counter?
Current output:
Text
0 Trump will drop a bomb on North Korea
1 Building a wall on the U.S.-Mexico border will take literally years
2 Wisconsin is on pace to double the number of layoffs this year.
3 Says John McCain has done nothing to help the vets.
4 Suzanne Bonamici supports a plan that will cut choice for Medicare
Any help would be greatly appreciated.
Answer 0 (score: 2)
For pandas, use str.split, stack and value_counts (here df is the DataFrame, called news in the question):
series = df.Text.str.split(expand=True).stack().value_counts()
A Python-based alternative using chain.from_iterable (for flattening) and Counter (for counting):
from collections import Counter
from itertools import chain
counter = Counter(chain.from_iterable(map(str.split, df.Text.tolist())))
Recreate a series of counts using:
series = pd.Series(counter).sort_values(ascending=False)
This gives the same counts as the pandas solution above and should be faster, since no stacking is involved (stack is a slow operation).