Building a dictionary of counts based on a substring

Asked: 2019-01-31 19:02:38

Tags: python string dictionary

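I am trying to open a file, loop through each line, and count a specific word on each line so that I can add the counts to a dictionary.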

For the purposes of this question, the values in the file look like this:

From some.email@address.ac.za Sat Jan  5 09:14:16 2008
From some.email@address.ac.za Sat Jan  5 09:14:16 2008
From some.email@address.ac.za Fri Jan  5 09:14:16 2008
From some.email@address.ac.za Wed Jan  5 09:14:16 2008
From some.email@address.ac.za Tue Jan  5 09:14:16 2008
From some.email@address.ac.za Sat Jan  5 09:14:16 2008
From some.email@address.ac.za Sat Jan  5 09:14:16 2008
From some.email@address.ac.za Sat Jan  5 09:14:16 2008

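What I want to do is read in the file, loop through each line, keep a count of the days of the week, and return the counts in a dictionary. The result should look like this: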

{'Sat': 5, 'Fri': 1, 'Wed': 1, 'Tue': 1}

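I have read in the file, split each line on whitespace, and appended the result to a list. After that I am stuck: I cannot work out how to drill down into each list and test one specific part of it.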

Any ideas?

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()

counts = dict()
l1 = []
for line in fhand:
    line = line.split()
    l1.append(line)
for date in l1:
    for day in date:
        if day[2] not in counts:
            counts[day] = 1
        else:
            counts[day] += 1

1 Answer:

Answer 0 (score: 2)

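from collections import defaultdict

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except OSError:
    print('File cannot be opened:', fname)
    exit()

# Day abbreviations as they appear in the 'From' lines.
days = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

counts = defaultdict(int)  # missing keys start at 0
for line in fhand:
    # Split the line on whitespace and count every word that is a day name.
    for word in line.split():
        if word in days:
            counts[word] += 1
fhand.close()

# Convert to a plain dict so the output looks like {'Sat': 5, 'Fri': 1, ...}.
print(dict(counts))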

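After reading each line and splitting it into words, you can check whether each word is a day name and, if it is, increment the count stored for that day.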
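If every line follows the layout shown in the question (From, address, day, month, ...), the day is always the third word, so it can be read by position instead of testing every word. The following is only a sketch under that assumption; the function name count_days and the DAY_NAMES set are made up for illustration and are not part of the original answer.

```python
from collections import Counter

# Hypothetical helper names; the layout "From <address> <day> ..." is assumed.
DAY_NAMES = {'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'}

def count_days(lines):
    """Count day abbreviations found in the third field of 'From' lines."""
    counts = Counter()
    for line in lines:
        words = line.split()
        # Skip short lines and lines that do not start with the word 'From'.
        if len(words) > 2 and words[0] == 'From' and words[2] in DAY_NAMES:
            counts[words[2]] += 1
    return dict(counts)

sample = [
    'From some.email@address.ac.za Sat Jan  5 09:14:16 2008',
    'From some.email@address.ac.za Fri Jan  5 09:14:16 2008',
    'From some.email@address.ac.za Sat Jan  5 09:14:16 2008',
]
print(count_days(sample))
```

Because an open file object yields its lines when iterated, count_days(fhand) should work the same way on the file from the question. Checking only the third field also avoids counting a day name that happens to appear elsewhere on a line.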