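I want to find the frequency of certain words listed in wanted. The code does find frequencies, but the displayed result contains a lot of unnecessary data.
Code: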
from collections import Counter
import re
wanted = "whereby also thus"
cnt = Counter()
words = re.findall(r'\w+', open('C:/Users/user/desktop/text.txt').read().lower())
for word in words:
    if word in wanted:
        cnt[word] += 1
print(cnt)
Result:
Counter({'e': 131, 'a': 119, 'by': 38, 'where': 16, 's': 14, 'also': 13, 'he': 4, 'whereby': 2, 'al': 2, 'b': 2, 'o': 1, 't': 1})
Question:
Thanks in advance for your help.
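Answer 0: (score: 1)
As others have pointed out, you need to change the string wanted into a list. With a string, word in wanted is a substring test, which is why fragments such as 'e', 'a' and 'by' show up in your result. I just hard-coded the list, but if a string is passed into a function, you can use str.split(" "). I also implemented the frequency counter for you. As a side note, make sure to close the file; using the with open statement is also easier (and recommended).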
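from collections import Counter
import re
# A list makes "word in wanted" compare whole words instead of substrings.
wanted = ["whereby", "also", "thus"]
cnt = Counter()
# The with statement closes the file automatically.
with open('C:/Users/user/desktop/text.txt', 'r') as fp:
    fp_contents = fp.read().lower()
words = re.findall(r'\w+', fp_contents)
for word in words:
    if word in wanted:
        cnt[word] += 1
print(cnt)
# Average count per distinct wanted word that was found.
total_cnt = sum(cnt.values())
if cnt:  # avoid dividing by zero when nothing matched
    print(float(total_cnt) / len(cnt))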
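The difference between the substring test and the list (or set) membership test can be seen without any file at all. The following is a minimal sketch; the sample sentence is made up for illustration and stands in for the contents of text.txt, and wanted_str is a hypothetical name for the original string.

```python
from collections import Counter
import re

# Hypothetical sample text standing in for the contents of text.txt.
sample = "Whereby he also left; thereby also thus. Where also?"

wanted_str = "whereby also thus"
wanted = set(wanted_str.split())  # {'whereby', 'also', 'thus'}

# Substring test: 'he' occurs inside 'whereby', so it would be counted.
print("he" in wanted_str)  # True
# Membership test: only whole words in the set match.
print("he" in wanted)      # False

words = re.findall(r"\w+", sample.lower())
cnt = Counter(word for word in words if word in wanted)
print(cnt)

# Mean count per distinct word found; guard against an empty counter.
average = sum(cnt.values()) / len(cnt) if cnt else 0.0
print(average)
```

A set is used here instead of a list only because membership tests on a set do not slow down as the list of wanted words grows; for three words either works.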
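Answer 1: (score: 0)
I made this small variation of Axel's code that reads a txt file from the web, Alice in Wonderland, so I could apply the code (I don't have your txt file and I wanted to try it). I'm posting it here in case someone needs something like this.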
from collections import Counter
import re
from urllib.request import urlopen
# Note: str() on the downloaded bytes gives their b'...' repr, not decoded text.
testo = str(urlopen("https://www.gutenberg.org/files/11/11.txt").read())
wanted = ["whereby", "also", "thus", "Alice", "down", "up", "cup"]
cnt = Counter()
words = re.findall(r'\w+', testo)
for word in words:
    if word in wanted:
        cnt[word] += 1
print(cnt)
# Average count per distinct wanted word that was found.
total_cnt = sum(cnt.values())
print(float(total_cnt) / len(cnt))
Output
Counter({'Alice': 334, 'up': 97, 'down': 90, 'also': 4, 'cup': 2})
105.4
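One detail worth knowing about the download step: urlopen(...).read() returns bytes, and calling str() on bytes produces the b'...' representation in which line breaks become the literal characters \r\n. A word at the start of a line can then be glued to the trailing n and missed. The sketch below shows the effect on a small, made-up bytes value (raw is a hypothetical stand-in for the downloaded data, and UTF-8 is an assumed encoding), so the counts it prints say nothing about the real book.

```python
import re
from collections import Counter

# Hypothetical stand-in for the bytes returned by urlopen(...).read().
raw = b"Alice went down.\r\nAlice came up."

# repr of the bytes: starts with b' and contains a literal backslash-r, backslash-n
as_repr = str(raw)
# decoded text with a real line break (UTF-8 is an assumption here)
as_text = raw.decode("utf-8")

wanted = {"Alice", "down", "up"}
count_repr = Counter(w for w in re.findall(r"\w+", as_repr) if w in wanted)
count_text = Counter(w for w in re.findall(r"\w+", as_text) if w in wanted)

print(count_repr)  # the second 'Alice' is tokenised as 'nAlice' and missed
print(count_text)  # both occurrences of 'Alice' are counted
```

Decoding with .decode() instead of str() would therefore be the safer choice when exact counts matter.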
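This answer (from the author of the question) addresses a request to find how many times a word appears in adjacent sentences. If the same word occurs more than once in a sentence (for example, 'have') and it also occurs in the next sentence, I count it as 1 repetition. That is why I use the wordfound list.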
from collections import Counter
import re
testo = """There was nothing so VERY remarkable in that; nor did Alice think it so? Thanks VERY much. Out of the way to hear the Rabbit say to itself, 'Oh dear! Oh dear! I shall be late!' (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed. Quite natural); but when the Rabbit actually TOOK A WATCH OUT OF ITS? WAISTCOAT-POCKET, and looked at it, and then hurried on.
Alice started to her feet, for it flashed across her mind that she had never before seen a rabbit. with either a waistcoat-pocket, or a watch to take out of it! and burning with curiosity, she ran across the field after it, and fortunately was just in time to see it pop? Down a large rabbit-hole under the hedge.
Alice opened the door and found that it led into a small passage, not much larger than a rat-hole: she knelt down and looked along the passage into the loveliest garden you ever saw. How she longed to get out of that dark hall, and wander about among those beds of bright flowers and those cool fountains, but she could not even get her head through the doorway; 'and even if my head would go through,' thought poor Alice, 'it would be of very little use without my shoulders. Oh, how I wish I could shut up like a telescope! I think I could, if I only knew how to begin.'For, you see, so many out-of-the-way things had happened lately, that Alice had begun to think that very few things indeed were really impossible. There seemed to be no use in waiting by the little door, so she went back to the table, half hoping she might find another key on it, or at any rate a book of rules for shutting people up like telescopes: this time she found a little bottle on it, ('which certainly was not here before,' said Alice,) and round the neck of the bottle was a paper label, with the words 'DRINK ME' beautifully printed on it in large letters. It was all very well to say 'Drink me,' but the wise little Alice was not going to do THAT in a hurry. 'No, I'll look first,' she said, 'and see whether it's marked "poison" or not'; for she had read several nice little histories about children who had got burnt, and eaten up by wild beasts and other unpleasant things, all because they WOULD not remember the simple rules their friends had taught them: such as, that a red-hot poker will burn you if you hold it too long; and that if you cut your finger VERY deeply with a knife, it usually bleeds; and she had never forgotten that, if you drink much from a bottle marked 'poison,' it is almost certain to disagree with you, sooner or later. 
However, this bottle was NOT marked 'poison,' so Alice ventured to taste it, and finding it very nice, (it had, in fact, a sort of mixed flavour of cherry-tart, custard, pine-apple, roast turkey, toffee, and hot buttered toast,) she very soon finished it off. """
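# Split the text into sentences: a capital letter up to the next ., ! or ?
frasi = re.findall(r"[A-Z].*?[.!?]", testo, re.MULTILINE | re.DOTALL)
print("How many times this words are repeated in adjacent sentences:")
cnt2 = Counter()
for n, s in enumerate(frasi):
    words = re.findall(r"\w+", s)
    wordfound = []  # words of this sentence already seen in the next one
    for word in words:
        try:
            # Substring test against the next sentence (also matches parts of words).
            if word in frasi[n + 1]:
                wordfound.append(word)
                # Count each word only once per pair of sentences.
                if wordfound.count(word) < 2:
                    cnt2[word] += 1
        except IndexError:
            # The last sentence has no following sentence.
            pass
for k, v in cnt2.items():
    print(k, v)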
Output
How many times this words are repeated in adjacent sentences:
had 1
hole 1
or 1
as 1
little 2
that 1
hot 1
large 1
it 5
to 5
a 6
not 3
and 2
s 1
me 1
bottle 1
is 1
no 1
the 6
how 1
Oh 1
she 2
at 1
marked 1
think 1
VERY 1
I 2
door 1
red 1
of 1
dear 1
see 1
could 2
in 2
so 1
was 1
poison 1
A 1
Alice 3
all 1
nice 1
rabbit 1
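Entries such as s and a in the output above come from the substring test against the next sentence, the same effect that caused the original problem. A possible variant, not the author's method, compares whole words by turning every sentence into a set of words and intersecting neighbouring sets. The function name adjacent_repeats and the sample text are hypothetical and only serve the illustration.

```python
from collections import Counter
import re

def adjacent_repeats(text):
    """Count, per word, the adjacent sentence pairs in which it appears in both."""
    # Same sentence pattern as above: a capital letter up to ., ! or ?
    sentences = re.findall(r"[A-Z].*?[.!?]", text, re.DOTALL)
    # One set per sentence, so repeats inside a sentence collapse to one entry.
    word_sets = [set(re.findall(r"\w+", s)) for s in sentences]
    counts = Counter()
    # Pair each sentence with the one that follows it.
    for current, following in zip(word_sets, word_sets[1:]):
        counts.update(current & following)
    return counts

sample = "The cat sat. The dog sat too. A bird flew. A bird sang!"
result = adjacent_repeats(sample)
print(result)
```

Because each sentence is reduced to a set, a word repeated within one sentence still counts only once per pair, so the wordfound bookkeeping is no longer needed. The comparison stays case-sensitive, as in the code above.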