Question

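As the title says:

This is as far as I have gotten with my code. It works, but I can't get it to display the information in order. At the moment it just displays it in random order.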

def frequencies(filename):
    infile=open(filename, 'r')
    wordcount={}
    content = infile.read()
    infile.close()
    counter = {}
    invalid = "‘'`,.?!:;-_\n—' '"

    for word in content:
        word = content.lower()
        for letter in word:
            if letter not in invalid:
                if letter not in counter:
                    counter[letter] = content.count(letter)
                    print('{:8} appears {} times.'.format(letter, counter[letter]))

Any help is greatly appreciated.

Answer 1

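Dictionaries are not sorted by value (and before Python 3.7 they did not guarantee any order at all). Also, if you want to count items in a collection of data, you are better off using collections.Counter(), which is more optimized and more Pythonic for this purpose.

Then you can simply use Counter.most_common(N) to get the N most common items in the Counter object.

As for opening the file, you can use a with statement, which closes the file automatically at the end of the block. It is also better not to print the final result inside the function. Instead, you can make the function a generator by yielding the expected lines, and then print them whenever you want.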

from collections import Counter

def frequencies(filename, top_n):
    with open(filename) as infile:
        content = infile.read()
    invalid = "‘'`,.?!:;-_\n—' '"
    counter = Counter(filter(lambda x: x not in invalid, content))
    for letter, count in counter.most_common(top_n):
        yield '{:8} appears {} times.'.format(letter, count)

Then iterate over the generator function with a for loop:

for line in frequencies(filename, 100):
    print(line)
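
To see what most_common returns, here is a minimal sketch on an in-memory string. The sample text and the variable names are made up for illustration and are not part of the answer above:

```python
from collections import Counter

# Hypothetical sample text, used here instead of reading a file.
sample = "hello world"

# Count every character except the space.
counter = Counter(ch for ch in sample if ch != " ")

# most_common(n) returns a list of (item, count) pairs, highest count first.
top_two = counter.most_common(2)
print(top_two)
```

Because the result is a plain list of tuples, it can be unpacked directly in a for loop, as the generator above does.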

Answer 2

You don't need to iterate over 'words' and then over the letters in them. When you iterate over a string (like content), you already get single characters (strings of length 1). You should also wait until after your counting loop before showing any output. After counting, you could sort manually:

for letter, count in sorted(counter.items(), key=lambda x: x[1], reverse=True):
    # do stuff
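
Put together, the count-first, sort-afterwards approach might look like the sketch below. The helper name count_letters, the sample text, and the short invalid string are assumptions chosen for illustration:

```python
def count_letters(text, invalid):
    """Count each character of text in a single pass, skipping invalid ones."""
    counter = {}
    for letter in text.lower():
        if letter not in invalid:
            # get() returns 0 the first time a letter is seen.
            counter[letter] = counter.get(letter, 0) + 1
    return counter

# Hypothetical sample text instead of file content.
counter = count_letters("Banana, bandana!", invalid=" ,.!?")

# Sort the (letter, count) pairs by count, highest first, after counting is done.
ordered = sorted(counter.items(), key=lambda x: x[1], reverse=True)
for letter, count in ordered:
    print('{:8} appears {} times.'.format(letter, count))
```

Each character is visited once, which avoids calling content.count(letter) repeatedly over the whole text.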

However, it is better to use collections.Counter:

from collections import Counter

content = filter(lambda x: x not in invalid, content)
c = Counter(content)
for letter, count in c.most_common():  # descending order of counts
    print('{:8} appears {} times.'.format(letter, count))
# for letter, count in c.most_common(n):  # limit to the n most common
#     print('{:8} appears {} times.'.format(letter, count))

Answer 3

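Displaying the results in descending order has to happen outside the counting loop. Otherwise the letters are displayed in the order in which they are encountered.

Sorting in descending order is simple with the built-in sorted (you need to set the reverse argument!).

However, Python comes with batteries included and already has a Counter. So it can be as simple as this: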

from collections import Counter
from operator import itemgetter

def frequencies(filename):
    # Sets are especially optimized for fast lookups so this will be
    # a perfect fit for the invalid characters.
    invalid = set("‘'`,.?!:;-_\n—' '")

    # Using open in a with block makes sure the file is closed afterwards.
    with open(filename, 'r') as infile:  
        # The "char for char ...." is a conditional generator expression
        # that feeds all characters to the counter that are not invalid.
        counter = Counter(char for char in infile.read().lower() if char not in invalid)

    # If you want to display the values:
    for char, charcount in sorted(counter.items(), key=itemgetter(1), reverse=True):
        print(char, charcount)

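Counter already has a most_common method, which is the natural fit if you only want the x most common characters. Since you want to display all characters and counts, I used sorted here. (Note that most_common() called without an argument also returns all elements in descending order of count.)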
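
For comparison, a short sketch showing that most_common() called without an argument returns every element, ordered the same way as the sorted call used in the function above. The sample word is an arbitrary choice for illustration:

```python
from collections import Counter
from operator import itemgetter

# Hypothetical sample input instead of file content.
counter = Counter("mississippi")

# Explicit sort by count, highest first.
by_sorted = sorted(counter.items(), key=itemgetter(1), reverse=True)

# most_common() with no argument returns all (item, count) pairs.
by_most_common = counter.most_common()

print(by_most_common)
```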

Answer 4

You can use the sorted function to sort the dictionary when printing:

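lettercount = {}
invalid = "‘'`,.?!:;-_\n—' '"
# The with block closes the file automatically.
with open('text.file') as infile:
    for c in infile.read().lower():
        if c not in invalid:
            # setdefault returns 0 the first time a letter is seen.
            lettercount[c] = lettercount.setdefault(c, 0) + 1
# Sort the letters by their count, highest first.
for letter in sorted(lettercount, key=lettercount.get, reverse=True):
    print("{} appears {} times".format(letter, lettercount[letter]))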

Note: I use the setdefault method to set the default value to 0 the first time a letter is encountered.
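
As an alternative to setdefault, collections.defaultdict(int) supplies the 0 default automatically. This is a sketch of a swapped-in technique, not the approach used in the answer, and the sample string is made up:

```python
from collections import defaultdict

# Missing keys start at int(), which is 0.
lettercount = defaultdict(int)

# Hypothetical sample text instead of file content.
for c in "abracadabra":
    lettercount[c] += 1

# Letters ordered by count, highest first.
ordered = sorted(lettercount, key=lettercount.get, reverse=True)
for letter in ordered:
    print("{} appears {} times".format(letter, lettercount[letter]))
```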

I am trying to count all the letters in a txt file and then display them in descending order

4 answers: