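How do I create a dictionary with the contents of a CSV column as keys and their occurrence counts as values?

Time: 2019-09-09 16:55:01

Tags: python csv dictionary

I have a CSV file with the following columns: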

Item Name, Item Type, Manufacturer Name

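I need to write a function that creates a dictionary whose keys are the phrases in the Item Type column and whose values are the number of times each phrase occurs, and then I need to print that dictionary.

As far as I can tell, it adds the Item Type as a key, but I am having trouble storing the associated value.

Here is the CSV content: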

Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Carcosa,Fuzz,DOD
Big Muff Pi (Black Russian),Fuzz,Electro-Harmonix
Octopuss,Passive Octave Up,Bigfoot Engineering
Small Stone,Phaser,Electro-Harmonix
Grand Orbiter,Phaser,Earthquaker Devices
Hummingbird,Tremolo,Earthquaker Devices
Echosystem,Digital Delay,Empress Effects
Freeze,Sound Retainer,Electro-Harmonix
Ditto,Looper,TC Electronic
Stamme[n],Glitch Delay,Drolo

Here is my code:

def countItemTypes(fileName):
    #create an empty dictionary as we need to store key/value pairs
    itemDic = {}
    # where fileName is the name of the csv file
    #first we must open the csv file and read it
    import csv
    with open(fileName, "r") as itemFile:
    #we are using itemFile as the handle
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        #skip the header because we don't need to do anything with it
        next(csvReader)
        #now that we have skipped the header we need to iterate through the rows
        for row in csvReader:
            #troubleshooting diagnostic, for loop:
            #print(row)
            #now we need to take the second column entry of the csv and assign that as the key
            #and the total number of its instances as the value to that key
            #quite frankly I have no idea how to do that.
            if itemDic[row[1]] not in itemDic:
                itemDic[row[1]] = 1
            else:
                itemDic[row[1]] += 1
        #print the new dictionary
        print (itemDic)

It raises KeyError: 'Fuzzstortion' when it reaches:

if itemDic[row[1]] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1

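3 Answers:

Answer 0: (score: 1)

The problem with your if condition is that it looks up itemDic[row[1]] before the key exists. What you actually want to check is this: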

# Check row[1] not in the dictionary
if row[1] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1
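
To illustrate the difference, here is a minimal sketch that needs no CSV file. The helper names count_values and lookup_raises_key_error are hypothetical and only serve the demonstration; they are not part of the original code.

```python
def count_values(values):
    """Count occurrences using the corrected membership test on the key."""
    counts = {}
    for value in values:
        # Test the key itself; no dictionary lookup happens here.
        if value not in counts:
            counts[value] = 1
        else:
            counts[value] += 1
    return counts


def lookup_raises_key_error(key):
    """Return True if the original condition raises KeyError on an empty dict."""
    counts = {}
    try:
        # The original condition: counts[key] is evaluated first,
        # and that lookup fails because the key is not stored yet.
        counts[key] not in counts
    except KeyError:
        return True
    return False


print(count_values(["Fuzz", "Phaser", "Fuzz"]))   # {'Fuzz': 2, 'Phaser': 1}
print(lookup_raises_key_error("Fuzzstortion"))    # True
```

The first row of the file is enough to trigger the error, because the dictionary is still empty when 'Fuzzstortion' is looked up.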

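Answer 1: (score: 0)

Use collections.defaultdict

  • defaultdict comes from the collections module and is a subclass of dict. It behaves like dict, except that defaultdict supplies a default value for a new key, so there is no need to check whether the key exists and set a value first.
  • Replace itemDic = {} with itemDic = defaultdict(int)
    • itemDic = Counter() also works, as Raulillo points out
  • Replace the if-else part with itemDic[row[1]] += 1
  • The code is implemented below: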
from collections import defaultdict, Counter  # pick which one you want to use
import csv


def countItemTypes(fileName: str) -> defaultdict:
    """
    Parse a csv file and return a dict with the word count
    of the second column, Item Type
    fileName: Name of csv file to parse
    """
    # create an empty defaultdict
    itemDic = defaultdict(int)  # or you can use itemDic = Counter()
    # open fileName; newline="" is what the csv module docs recommend
    with open(fileName, "r", newline="") as itemFile:
        # we are using itemFile as the handle
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        #skip the header
        next(csvReader)
        # iterate through the rows
        for row in csvReader:
            # assign word from second column as a key and count occurrences
            itemDic[row[1]] += 1
        #return the new dictionary
        return itemDic

Usage:

word_count = countItemTypes('test.csv')
print(word_count)

>>>
defaultdict(int,
            {'Fuzzstortion': 1,
             'Fuzz': 4,
             'Passive Octave Up': 1,
             'Phaser': 2,
             'Tremolo': 1,
             'Digital Delay': 1,
             'Sound Retainer': 1,
             'Looper': 1,
             'Glitch Delay': 1})
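
For the Counter variant mentioned above, a minimal sketch could look like the following. It assumes the data is read from an in-memory string (io.StringIO) instead of test.csv so that it runs on its own; the function name count_item_types and the shortened SAMPLE_CSV are hypothetical.

```python
import csv
import io
from collections import Counter

# A shortened copy of the question's data, kept in memory for the demo.
SAMPLE_CSV = """Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Small Stone,Phaser,Electro-Harmonix
"""


def count_item_types(csv_file):
    """Count the values of the second column of an open CSV file object."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    # Counter consumes the iterable and tallies each Item Type;
    # rows with fewer than two columns are ignored.
    return Counter(row[1] for row in reader if len(row) > 1)


counts = count_item_types(io.StringIO(SAMPLE_CSV))
print(counts["Fuzz"])          # 2
print(counts.most_common(1))   # [('Fuzz', 2)]
```

Counter is also a dict subclass, and it adds helpers such as most_common() for listing the most frequent types.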

Answer 2: (score: 0)

You are asking the wrong question in your if statement

itemDic[row[1]] not in itemDic

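itemDic stores pairs of a type and its repetition count.

  • What you are asking:

    Is the repetition count of the type in row[1] not in the dictionary?

  • What you want to ask:

    Is the type in row[1] not in the dictionary?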

    row[1] not in itemDic

As a side note, try to put all imports at the top of the file; it is clearer and more readable.
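
Putting both points together, a sketch of the function might look like this. It is one possible arrangement, not the only one: the name count_item_types, the returned dict (instead of printing inside the function), and the temporary demo file are assumptions made for illustration.

```python
import csv
import os
import tempfile


def count_item_types(file_name):
    """Return a dict mapping each Item Type to its number of occurrences."""
    item_counts = {}
    # newline="" is what the csv module documentation recommends for files.
    with open(file_name, "r", newline="") as item_file:
        reader = csv.reader(item_file)
        next(reader)  # skip the header row
        for row in reader:
            item_type = row[1]
            # Ask whether the type itself is a key, not its stored count.
            if item_type not in item_counts:
                item_counts[item_type] = 1
            else:
                item_counts[item_type] += 1
    return item_counts


# Demo on a throwaway file so the sketch runs without the original CSV.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write(
        "Item Name, Item Type, Manufacturer Name\n"
        "Carcosa,Fuzz,DOD\n"
        "Small Stone,Phaser,Electro-Harmonix\n"
        "Sputnik II,Fuzz,Spaceman Pedals\n"
    )
    tmp_path = tmp.name

demo_counts = count_item_types(tmp_path)
os.remove(tmp_path)  # clean up the temporary file
print(demo_counts)   # {'Fuzz': 2, 'Phaser': 1}
```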