Question

This is assignment 10.2 from Chapter 10 of Python For Everyone, where the problem statement is:

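Write a program to read through mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out of the "From " line by finding the time and then splitting that string a second time using a colon. "From person@example.com Sat Jan 5 09:14:16 2008" Once you have accumulated the counts for each hour, print out the counts, sorted by hour, as shown below.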

The desired output is

My code is here

`name = raw_input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
handle = open(name)
counts = dict()

for line in handle:
    line = line.rstrip()
    if line.startswith("From "):
        parts = line.split()
#        print parts
        time = parts[5]
        pieces = time.split(':')
        hour = pieces[0]
        counts[hour] = counts.get(hour,0)+1
print counts `

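The text file can be found at http://www.pythonlearn.com/code/mbox-short.txt. While debugging, I came to believe that my program goes over each line multiple times, which gives values that are too high for each hour. I am sure that line.startswith("From ") is the right way to read only the intended lines, because I used it in a previous assignment.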

How can I get the correct frequency for each hour?

Answer 1

The code you wrote works fine.

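The output dictionary is simply not sorted. You can use sorted(counts) to get a sorted list of the keys, and then use those keys to print your dictionary in order. Because the hours are zero-padded two-digit strings such as "09", sorting them as strings gives the same order as sorting them as numbers.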

name = raw_input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
handle = open(name)
counts = dict()

for line in handle:
    line = line.rstrip()
    if line.startswith("From "):
        parts = line.split()
        time = parts[5]
        pieces = time.split(':')
        hour = pieces[0]
        counts[hour] = counts.get(hour,0)+1

for key in sorted(counts):
    print key + " " + str(counts[key])
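The code above is Python 2 (raw_input and the print statement). For readers on Python 3, here is a minimal sketch of the same counting and sorting logic. The function name count_hours and the sample lines are made up for illustration and are not taken from mbox-short.txt; the length check on parts is an extra guard that the original code does not have.

```python
def count_hours(lines):
    """Count "From " lines per hour; returns a dict such as {'09': 2}."""
    counts = {}
    for line in lines:
        # "From " (with a trailing space) skips header lines like "From: ..."
        if not line.startswith("From "):
            continue
        parts = line.split()
        if len(parts) < 6:
            continue  # extra guard (assumption): skip malformed lines
        hour = parts[5].split(":")[0]  # "09:14:16" -> "09"
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# Hypothetical sample input standing in for the real file
sample = [
    "From person@example.com Sat Jan  5 09:14:16 2008",
    "From: person@example.com",
    "From other@example.com Fri Jan  4 18:10:48 2008",
    "From person@example.com Fri Jan  4 09:02:11 2008",
]

# Sorting the (hour, count) pairs orders the output by hour
for hour, count in sorted(count_hours(sample).items()):
    print(hour, count)
```

To run this against the real file, you could pass an open file handle instead of the list, for example count_hours(open("mbox-short.txt")), since iterating over a file yields its lines.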

Output
