I'm trying to write code that takes an arbitrary sentence as input and counts how many times each letter occurs in that string:
def getfreq(lines):
    """ calculate a list with letter frequencies
    lines - list of lines (character strings)
    both lower and upper case characters are counted.
    """
    totals = 26*[0]
    chars = []
    for line in lines:
        for ch in line:
            chars.append(totals)
    return totals
    # convert totals to frequency
    freqlst = []
    grandtotal = sum(totals)
    for total in totals:
        freq = totals.count(chars)
        freqlst.append(freq)
    return freqlst
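So far I've managed to append every letter (character) of the input to a list. But now I need a way to count how many times each character occurs in that list and to express the result as a frequency.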
Answer 0: (score: 1)
The collections module has a very handy class, Counter, which counts how often each object occurs in a sequence:
import collections
collections.Counter('A long sentence may contain repeated letters')
This yields:
Counter({' ': 6,
         'A': 1,
         'a': 3,
         'c': 2,
         'd': 1,
         'e': 8,
         'g': 1,
         'i': 1,
         'l': 2,
         'm': 1,
         'n': 5,
         'o': 2,
         'p': 1,
         'r': 2,
         's': 2,
         't': 5,
         'y': 1})
In your case, you'd probably want to concatenate your lines first, e.g. with ''.join(lines), before passing them to Counter.
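As a rough sketch of how that could slot into the getfreq function from the question: the 26-slot list layout comes from the question's totals = 26*[0], while restricting the count to the ASCII letters a-z and folding case with lower() are assumptions on my part.

```python
import collections
import string

def getfreq(lines):
    """Return 26 relative letter frequencies, index 0 for 'a' through 25 for 'z'."""
    # Join the lines and lower-case them so 'A' and 'a' share one count.
    counts = collections.Counter(''.join(lines).lower())
    # Assumption: only the ASCII letters a-z are of interest.
    totals = [counts[letter] for letter in string.ascii_lowercase]
    grandtotal = sum(totals)
    if grandtotal == 0:
        return totals  # no letters at all; avoid dividing by zero
    return [total / grandtotal for total in totals]

freqs = getfreq(["A long sentence", "may contain repeated letters"])
print(round(freqs[4], 3))  # share of the letter 'e' among all letters
```

A Counter returns 0 for keys it has never seen, so letters that don't appear in the text need no special handling.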
If you'd like a similar result using a plain dictionary, you could do something like this:
counts = {}
for c in my_string:
    counts[c] = counts.get(c, 0) + 1
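Depending on your Python version this may be slower, but it uses the .get() method of dict to return either the existing count or a default value, and then increments the count for each character in the string.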
Answer 1: (score: 1)
Without collections.Counter:
import collections
sentence = "A long sentence may contain repeated letters"
count = collections.defaultdict(int) # save some time with a dictionary factory
for letter in sentence: # iterate over each character in the sentence
    count[letter] += 1 # increase the count for this character
Or, if you really want to do it entirely by hand:
sentence = "A long sentence may contain repeated letters"
count = {} # a counting dictionary
for letter in sentence: # iterate over each character in the sentence
    count[letter] = count.get(letter, 0) + 1 # get the current value and increase by 1
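In both cases, the count dictionary will have each distinct character as a key, and the value will be the number of times that character was encountered, e.g.: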
print(count["e"]) # 8
If you want it to be case-insensitive, be sure to call letter.lower() when adding the letter to the count.
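A minimal sketch of that case-insensitive variant, reusing the defaultdict approach and the same sample sentence from above:

```python
import collections

sentence = "A long sentence may contain repeated letters"
count = collections.defaultdict(int)
for letter in sentence:
    # lower() folds 'A' into 'a' so both land on the same key
    count[letter.lower()] += 1

print(count["a"])  # 4: one upper-case 'A' plus three lower-case 'a'
```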
Answer 2: (score: 0)
You can use a set to reduce the text to its unique characters, and then simply count each one:
text = ' '.join(lines) # Create one long string
# Then create a set of all unique characters in the text
characters = {char for char in text if char.isalpha()}
statistics = {} # Create a dictionary to hold the results
for char in characters: # Loop through unique characters
    statistics[char] = text.count(char) # and count them