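I have a list of 39,000 words. I need to count how many times each letter occurs across the whole list and store the counts in a dictionary, with the letters as keys and the number of occurrences as values. How can I do this?
The list in question is: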
['voluptuous',
'outbreak',
'starched',
'sharpest',
'widens',
'briefcase',
'stag',
'gracias',
'complexes',
'magnum',
'classifying',
'eloquent',
'forecasters',
'shepherd',
'vestments',
'indestructible',
'chartres',
'condemning',
'closet',
'davis',
'students',
.
.
.
So the expected output should look like this:
{'a': 2433,
'b': 5717,
'c': 1236,
'd': 12255,
'e': 35170,
'f': 4118,
'g': 8630,
'h': 7327,
'i': 26075,
'j': 6430,
'k': 2965,
'l': 16703,
'm': 8672,
'n': 22630,
'o': 19199,
'p': 8543,
'q': 5325,
'r': 22104,
's': 23730,
't': 20649,
'u': 10196,
'v': 3427,
'w': 2799,
'x': 828,
'y': 5344,
'z': 1031}
Answer 0 (score: 1)
Here is a variant using collections.Counter:
from collections import Counter
counter = Counter()
words = ['voluptuous',
'outbreak',
'starched',
'sharpest',
'widens',
'briefcase',
'stag',
'gracias',
'complexes',
'magnum',
'classifying',
'eloquent',
'forecasters',
'shepherd',
'vestments',
'indestructible',
'chartres',
'condemning',
'closet',
'davis',
'students']
for word in words:
    counter += Counter(word)
Or as a one-liner:
counter = Counter(char for word in words for char in word)
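Since the question asks for a plain dictionary keyed by letter, here is a minimal sketch of converting the Counter into one, sorted alphabetically. The three-word `words` list and the name `letter_counts` are placeholders for illustration, not part of the original answer.

```python
from collections import Counter

# Small sample standing in for the full 39,000-word list (assumption)
words = ['stag', 'closet', 'davis']

# Count every character of every word in a single pass
counter = Counter(char for word in words for char in word)

# Counter is a dict subclass; sorting its items and rebuilding a dict
# gives a plain dictionary ordered by letter
letter_counts = dict(sorted(counter.items()))
print(letter_counts)
```

Note that letters which never occur in the list will simply be absent from the result rather than mapped to 0.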
Answer 1 (score: 0)
You can use chain.from_iterable() together with Counter().
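The code for this answer did not survive in the thread, so the following is only a sketch of what combining the two might look like, assuming `chain` is `itertools.chain` and using a shortened `words` list as a stand-in for the real data.

```python
from collections import Counter
from itertools import chain

# Shortened sample; the real list has about 39,000 words (assumption)
words = ['voluptuous', 'outbreak', 'starched']

# chain.from_iterable treats each word as an iterable of characters and
# yields them one after another, so Counter sees a single stream of letters
counter = Counter(chain.from_iterable(words))
print(dict(counter))
```

This builds one Counter in a single pass instead of creating and adding a temporary Counter per word, which probably matters little for 39,000 words but avoids the repeated merges.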