Question

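This is my first question on Stack Overflow, so I apologize in advance if it is too vague or does not give enough information.

Basically, my problem is that my code fails to run because of a TypeError.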

import string

f = open('data/hamlet.txt', 'r')
text = f.read()

alphabet_freq = []

for c in string.ascii_lowercase :
    alphabet_freq.append(text.count(c) + text.count(c.upper()))

alphabet_freq_sum = 0

for _ in alphabet_freq :
    alphabet_freq_sum +=_

letter_frequency = []

for _ in alphabet_freq :
    letter_frequency.append(( _ / alphabet_freq_sum) * 100)

alphabets = list(string.ascii_lowercase)

letter_frequency_in_freq_order = []

for _ in letter_frequency :
    letter_frequency_in_freq_order.append(letter_frequency.pop(max(letter_frequency)))

print(letter_frequency_in_freq_order,letter_frequency)

stacktrace

**make: *** [py3_run] 오류 1                                                                        
Traceback (most recent call last):                                                                
  File "Main.out", line 26, in <module>                                                           
    letter_frequency_in_freq_order.append(letter_frequency.pop(max(letter_frequency)))            
TypeError: integer argument expected, got float**

I think the function max doesn't work on floats. Is that right?

Answer 1

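pop takes the index of the element to remove from the list and returns the value at that index.

You are not giving it an index but the maximum value, and an index into a list must of course be an integer.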
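To illustrate the point, here is a minimal sketch with made-up sample values (the names `freqs`, `remaining` and `ordered` are hypothetical, not from the question). It shows that passing a float to pop raises a TypeError, and one possible fix: look up the integer position of the maximum with list.index first. It also uses a while loop, because a for loop over a list that is being shrunk by pop would stop early.

```python
freqs = [8.2, 1.5, 12.7, 4.3]  # made-up sample values, not real Hamlet data

# pop() expects an integer position, so passing the float maximum fails
try:
    freqs.pop(max(freqs))
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # exact wording depends on the Python version

# look up the position of the maximum first, then pop that position;
# work on a copy so the original list stays intact
remaining = list(freqs)
ordered = []
while remaining:  # loop until the copy is empty instead of iterating over it
    ordered.append(remaining.pop(remaining.index(max(remaining))))

print(ordered)
```

For this particular task, `sorted(freqs, reverse=True)` would give the same ordering in one call.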

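You can use .index() to solve the problem, or rethink your whole approach:

Sometimes things are easier to do with some knowledge of the Python standard library.

In particular: collections.Counter

It is a dictionary specialized for counting. It passes over your whole text once and adds or increments a key for each character. You are calling .count() 52 times, twice for each character in string.ascii_lowercase, and each call traverses the whole string to count the occurrences of a single character.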

from collections import Counter
import string

t = """This is not a hamlet text, just a few letters and words.
Including newlines to demonstrate how to optimize your approach.
Of counting character frequencies..."""

# if you want all characters starting with a zero count, you can add and remove them
# if you do not need all characters, skip this step and go to c.update()
c = Counter(string.ascii_lowercase)  # all have count of 1
c.subtract(string.ascii_lowercase)   # now all are present with count of 0

# count all characters, passes text once 
c.update(t.lower()) # you can input your file.read() here, any iterable will do

# sum all character counts
totalSum = sum(c.values()) 

# get the counts ordered from max to min as (char, count) tuples and convert
# them to relative frequencies with a list comprehension; the order is kept
top_n = [(a, b / totalSum) for a, b in c.most_common()]


# use this to strip e.g. .,! and \n from the output:
#     top_n = [(a, b / totalSum) for a, b in c.most_common() if a in string.ascii_lowercase]

import pprint
pprint.pprint(c) 

pprint.pprint(top_n)

Output:

Counter({' ': 22,            't': 15,            'e': 14,            'o': 11,
         'n': 10,            'a': 9,             'i': 9,             'r': 8,
         's': 8,             'c': 6,             'h': 5,             'u': 5,
         '.': 5,             'd': 4,             'l': 4,             'w': 4,
         'f': 3,             'm': 3,             'p': 3,             'g': 2,
         '\n': 2,            'j': 1,             'q': 1,             'x': 1,
         'y': 1,             'z': 1,             ',': 1,             'b': 0,
         'k': 0,             'v': 0})

[(' ', 0.13924050632911392),     ('t', 0.0949367088607595),
 ('e', 0.08860759493670886),     ('o', 0.06962025316455696),
 ('n', 0.06329113924050633),     ('a', 0.056962025316455694),
 ('i', 0.056962025316455694),    ('r', 0.05063291139240506),
 ('s', 0.05063291139240506),     ('c', 0.0379746835443038),
 ('h', 0.03164556962025317),     ('u', 0.03164556962025317),
 ('.', 0.03164556962025317),     ('d', 0.02531645569620253),
 ('l', 0.02531645569620253),     ('w', 0.02531645569620253),
 ('f', 0.0189873417721519),      ('m', 0.0189873417721519),
 ('p', 0.0189873417721519),      ('g', 0.012658227848101266),
 ('\n', 0.012658227848101266),   ('j', 0.006329113924050633),
 ('q', 0.006329113924050633),    ('x', 0.006329113924050633),
 ('y', 0.006329113924050633),    ('z', 0.006329113924050633),
 (',', 0.006329113924050633),     
 ('b', 0.0),     ('k', 0.0),     ('v', 0.0)]
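Applied to the goal in the question (letters only, expressed as percentages, most frequent first), the same idea could be sketched as follows. The function name `letter_percentages` and the sample string are assumptions made for illustration; with a real file you would pass the result of `f.read()` instead.

```python
from collections import Counter
import string


def letter_percentages(text):
    """Return (letter, percent) pairs for a-z, most frequent first."""
    # count only ASCII letters, treating upper and lower case as the same letter
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        return []  # no letters at all, avoid dividing by zero
    return [(letter, count / total * 100) for letter, count in counts.most_common()]


sample = "To be, or not to be"  # hypothetical stand-in for the Hamlet text
result = letter_percentages(sample)
for letter, percent in result:
    print(letter, round(percent, 2))
```

Unlike the Counter above, this version only lists letters that actually occur, so letters with a zero count are left out.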
