Putting objects into lists in a dictionary

Date: 2014-12-01 04:36:40

Tags: python list object dictionary

I'm trying to create a Google Ngram-esque program in Python (for a CS-I project). I have a CSV file that looks like this:

aardvark, 2007, 123948
aardvark, 2008, 120423
aardvark, 2004, 96323
gorilla, 2010, 120302
gorilla, 2008, 89323
raptorjesus, 1996, 214

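The first value is the word, the second is the year in which the occurrences were counted, and the third is the number of occurrences.

I have a class CountByYear that takes a word, a year, and a frequency, and returns a CountByYear object.

I need to read through the CSV file and print a dictionary with the words as keys and lists of CountByYear objects (without the word) as values. For example: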

{'aardvark': [CountByYear(year=2007, count=123948), CountByYear(year=2008...etc.], 'gorilla': [CountByYear(year=2010, count=120302), etc...]}

I'm stuck on how to actually get the year and count into each object. Right now I'm doing:

for line in f:
    splitLine = line.strip().split(',')
    words[splitLine[0]] = countList
print(words)

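This prints {'aardvark': [], 'gorilla': [], 'raptorjesus': []}, which is good, because at least I know I'm doing the dictionary part correctly. But how do I fill those empty lists with the data I want?

3 Answers:

Answer 0: (score: 1)

You didn't include a sample of the CountByYear class, but you specified that it has a constructor that takes "word", "year", and "frequency".

Assuming a definition like this: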

class CountByYear(object):
    def __init__(self, word, year, frequency):
        self.word = word
        self.year = year
        self.frequency = frequency

    def __repr__(self):
        return "CountByYear(year=%s, count=%s)" % (self.year, self.frequency)

You can do this:

words = {}
for line in f:
    word,year,freq = [i.strip() for i in line.split(',')]
    #create a new list if one does not already exist for this word
    if not words.get(word):
        words[word] = []
    #add this CountByYear object to corresponding list in the dictionary
    words[word].append(CountByYear(word,year,freq))
print(words)

The output of the above code on the sample input file is:

{'gorilla': [CountByYear(year=2010, count=120302), CountByYear(year=2008, count=89323)], 'aardvark': [CountByYear(year=2007, count=123948), CountByYear(year=2008, count=120423), CountByYear(year=2004, count=96323)], 'raptorjesus': [CountByYear(year=1996, count=214)]}
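To try this answer without a file on disk, here is a minimal sketch that feeds the sample rows from the question through the same kind of loop. The `io.StringIO` object is only a stand-in for the open file `f`, the class is the definition assumed in this answer, and the `int()` conversions and the use of `dict.setdefault` are additions for illustration, not part of the original answer.

```python
import io


class CountByYear(object):
    # Same assumed definition as in the answer above
    def __init__(self, word, year, frequency):
        self.word = word
        self.year = year
        self.frequency = frequency

    def __repr__(self):
        return "CountByYear(year=%s, count=%s)" % (self.year, self.frequency)


# Stand-in for the open CSV file (assumption: same layout as the question's sample)
f = io.StringIO(
    "aardvark, 2007, 123948\n"
    "aardvark, 2008, 120423\n"
    "aardvark, 2004, 96323\n"
    "gorilla, 2010, 120302\n"
    "gorilla, 2008, 89323\n"
    "raptorjesus, 1996, 214\n"
)

words = {}
for line in f:
    word, year, freq = [field.strip() for field in line.split(',')]
    # setdefault returns the list already stored for this word,
    # or stores a new empty list and returns that
    words.setdefault(word, []).append(CountByYear(word, int(year), int(freq)))

print(words)
```

Converting the year and frequency to `int` is optional here, but it probably makes later sorting or arithmetic on those values less surprising than working with strings.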

Answer 1: (score: 0)

One approach is to use defaultdict. For example,

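from collections import defaultdict

# Missing keys automatically start out as an empty list
words = defaultdict(list)

with open("data.csv", "r") as f:
    for line in f:
        # Strip the space that follows each comma in the sample file
        key_name, year, count = [part.strip() for part in line.split(',')]
        # append() adds one entry per row; += [year, count] would add two loose strings
        words[key_name].append((year, count))
        # or  words[key_name].append(CountByYear(key_name, year, count)) or similar

print(words)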
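When grouping rows with a defaultdict, the choice between `+=` and `append` changes the shape of the result. The sketch below illustrates this with two hard-coded rows taken from the question's sample; the variable names `flat` and `nested` are made up for this example.

```python
from collections import defaultdict

# Two sample rows from the question, already parsed (illustrative data only)
rows = [("aardvark", 2007, 123948), ("aardvark", 2008, 120423)]

flat = defaultdict(list)
nested = defaultdict(list)

for word, year, count in rows:
    # += with a list extends the target, so year and count become separate items
    flat[word] += [year, count]
    # append adds a single item per row, here one (year, count) tuple
    nested[word].append((year, count))

print(flat["aardvark"])    # [2007, 123948, 2008, 120423]
print(nested["aardvark"])  # [(2007, 123948), (2008, 120423)]
```

Only the second form keeps each year paired with its count, which is what a list of CountByYear objects needs. Note also that `+=` with a single non-iterable object on the right-hand side raises a TypeError.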

Answer 2: (score: 0)

Try using the csv module (https://docs.python.org/3.4/library/csv.html) with something like:

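import csv

words = {}
with open('eggs.csv', newline='') as csvfile:
    # The sample file is comma-separated, with a space after each comma
    reader = csv.reader(csvfile, delimiter=',', skipinitialspace=True)

    for word, year, count in reader:
        # Build a new list from the existing one (or an empty one) plus this row's object
        words[word] = words.get(word, []) + [CountByYear(word, year, count)]

print(words)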
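The reader options matter for the file format shown in the question, which uses a comma followed by a space between fields. As a rough check, the sketch below parses one sample row from an in-memory string in two ways; `io.StringIO` stands in for the real file, and the variable names are made up for this example.

```python
import csv
import io

# One row in the same layout as the question's sample file
text = "aardvark, 2007, 123948\n"

# Splitting on spaces leaves the commas attached to the fields
space_rows = list(csv.reader(io.StringIO(text), delimiter=' ', quotechar='|'))

# Splitting on commas (the default) and skipping the space after each delimiter
comma_rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))

print(space_rows[0])  # ['aardvark,', '2007,', '123948']
print(comma_rows[0])  # ['aardvark', '2007', '123948']
```

With the comma-based settings, each row unpacks cleanly into `word`, `year`, and `count`, and the dictionary keys come out without stray punctuation.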