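Hi everyone, I'm new to Python and need to write a program that removes punctuation and then counts the number of words in a string. This is what I have so far: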
import sys
import string

def removepun(txt):
    for punct in string.punctuation:
        txt = txt.replace(punct,"")
    print txt
    mywords = {}
    for i in range(len(txt)):
        item = txt[i]
        count = txt.count(item)
        mywords[item] = count
    return sorted(mywords.items(), key = lambda item: item[1], reverse=True)
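The problem is that it returns letters and counts them, rather than the words I was hoping for. Can you help me fix this?
Answer 0 (score: 1)
How about this?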
>>> import string
>>> from collections import Counter
>>> s = 'One, two; three! four: five. six@#$,.!'
>>> occurrence = Counter(s.translate(None, string.punctuation).split())
>>> print occurrence
Counter({'six': 1, 'three': 1, 'two': 1, 'four': 1, 'five': 1, 'One': 1})
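Note that `s.translate(None, string.punctuation)` is the Python 2 form of the call. A minimal sketch of the same idea for Python 3, assuming a deletion table built with `str.maketrans` (the function name `count_words` is made up for illustration):

```python
import string
from collections import Counter

def count_words(text):
    # Build a table that deletes every ASCII punctuation character.
    table = str.maketrans("", "", string.punctuation)
    # Strip punctuation, split on whitespace, and count each word.
    return Counter(text.translate(table).split())

occurrence = count_words('One, two; three! four: five. six@#$,.!')
print(occurrence)
```

Each of the six words should appear with a count of 1, as in the session above.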
Answer 1 (score: 0)
numberOfWords = len(txt.split(" "))
This assumes there is exactly one space between words.
Edit:
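If the words might be separated by several spaces, tabs, or newlines, splitting on `" "` yields empty strings that inflate the count. A small sketch of the difference, using a made-up sample string:

```python
txt = "one  two three"  # note the double space after "one"

# Splitting on a single space keeps an empty string for the extra space.
print(txt.split(" "))

# Splitting with no argument treats any run of whitespace as one separator.
print(txt.split())

number_of_words = len(txt.split())
```

Calling `split()` with no argument is probably the safer default when the input has not been normalized.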
a={}
for w in txt.split(" "):
    if w in a:
        a[w] += 1
    else:
        a[w] = 1
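How it works:
The output is the same as in Haidro's excellent answer: a dictionary with the words as keys and the count of each word as values.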
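Putting the pieces together, the function from the question could be adapted along these lines. This is only a sketch in Python 3 syntax: the name `count_words_sorted` is hypothetical, and `dict.get` stands in for the explicit `if`/`else`.

```python
import string

def count_words_sorted(txt):
    # Remove every ASCII punctuation character, as in the question.
    for punct in string.punctuation:
        txt = txt.replace(punct, "")
    counts = {}
    # Iterate over words (split on whitespace), not over characters.
    for word in txt.split():
        counts[word] = counts.get(word, 0) + 1
    # Most frequent words first.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

print(count_words_sorted("the cat, the dog; the end!"))
```

The key change from the question's code is looping over `txt.split()` instead of over `range(len(txt))`, since indexing a string returns single characters.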