Question

I have a csv file that contains the following columns:

Item Name, Item Type, Manufacturer Name

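I need to write a function that creates a dictionary whose keys are the phrases in the Item Type column and whose values are the number of times each phrase occurs, and then I need to print that dictionary.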

As far as I can tell, it adds the Item Type as a key, but I run into trouble when storing the associated value.

Here is the csv content:

Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Carcosa,Fuzz,DOD
Big Muff Pi (Black Russian),Fuzz,Electro-Harmonix
Octopuss,Passive Octave Up,Bigfoot Engineering
Small Stone,Phaser,Electro-Harmonix
Grand Orbiter,Phaser,Earthquaker Devices
Hummingbird,Tremolo,Earthquaker Devices
Echosystem,Digital Delay,Empress Effects
Freeze,Sound Retainer,Electro-Harmonix
Ditto,Looper,TC Electronic
Stamme[n],Glitch Delay,Drolo

Here is my code:

def countItemTypes(fileName):
    #create an empty dictionary as we need to store key/value pairs
    itemDic = {}
    # where fileName is the name of the csv file
    #first we must open the csv file and read it
    import csv
    with open(fileName, "r") as itemFile:
    #we are using itemFile as the handle
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        #skip the header because we don't need to do anything with it
        next(csvReader)
        #now that we have skipped the header we need to iterate through the rows
        for row in csvReader:
            #troubleshooting diagnostic, for loop:
            #print(row)
            #now we need to take the second column entry of the csv and assign that as the key
            #and the total number of its instances as the value to that key
            #quite frankly I have no idea how to do that.
            if itemDic[row[1]] not in itemDic:
                itemDic[row[1]] = 1
            else:
                itemDic[row[1]] += 1
        #print the new dictionary
        print (itemDic)

It fails with KeyError: 'Fuzzstortion' when it reaches this part:

if itemDic[row[1]] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1

Answer 1

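The problem is your if condition: it looks up itemDic[row[1]] before that key exists. What you actually want to check is whether row[1] itself is in the dictionary: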

# Check row[1] not in the dictionary
if row[1] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1
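
The corrected check can be tried end to end with a minimal sketch like the one below. It reads a few rows of the sample data from an in-memory string through io.StringIO, which is an assumption made here so the snippet runs without a file on disk. The helper name count_item_types is hypothetical.

```python
import csv
import io

# A few rows of the sample data, kept in memory so no file is needed (assumption).
SAMPLE = """Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Small Stone,Phaser,Electro-Harmonix
"""


def count_item_types(lines):
    """Count how often each value appears in the second column."""
    counts = {}
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    for row in reader:
        key = row[1]
        if key not in counts:  # membership test on the key itself
            counts[key] = 1
        else:
            counts[key] += 1
    return counts


result = count_item_types(io.StringIO(SAMPLE))
print(result)  # {'Fuzzstortion': 1, 'Fuzz': 2, 'Phaser': 1}
```

Passing any iterable of lines (an open file works as well) keeps the counting logic separate from file handling.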

Answer 2

Use `collections.defaultdict`:

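defaultdict comes from the collections module and is a subclass of dict. It behaves like a dict, except that it supplies a default value for a new key, so there is no need to check whether a key exists before updating its value.
- Replace itemDic = {} with itemDic = defaultdict(int)
  - itemDic = Counter() would also work (suggested by Raulillo)
- Replace the if-else block with itemDic[row[1]] += 1
The implementation is shown below: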

from collections import defaultdict, Counter  # pick which one you want to use
import csv


def countItemTypes(fileName: str) -> defaultdict:
    """
    Parse a csv file and return a dict with the word count
    of the second column, Item Type
    fileName: Name of csv file to parse
    """
    # create an empty defaultdict
    itemDic = defaultdict(int)  # or you can use itemDic = Counter()
    # open fileName
    with open(fileName, "r", newline="") as itemFile:
        # we are using itemFile as the handle
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        #skip the header
        next(csvReader)
        # iterate through the rows
        for row in csvReader:
            # assign word from second column as a key and count occurrences
            itemDic[row[1]] += 1
        #return the new dictionary
        return itemDic

With the if-else gone, the code is simpler.
The function also includes the following:
- """description of the function""": a docstring
- See Documenting Python Code: A Complete Guide

Usage:

word_count = countItemTypes('test.csv')
print(word_count)

>>>
defaultdict(int,
            {'Fuzzstortion': 1,
             'Fuzz': 4,
             'Passive Octave Up': 1,
             'Phaser': 2,
             'Tremolo': 1,
             'Digital Delay': 1,
             'Sound Retainer': 1,
             'Looper': 1,
             'Glitch Delay': 1})
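
The Counter alternative mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the rows come from an in-memory string instead of test.csv, and the function name count_item_types_counter is hypothetical.

```python
import csv
import io
from collections import Counter

# In-memory stand-in for the csv file (assumption, so the snippet is self-contained).
SAMPLE = """Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Carcosa,Fuzz,DOD
Small Stone,Phaser,Electro-Harmonix
Grand Orbiter,Phaser,Earthquaker Devices
"""


def count_item_types_counter(lines):
    """Return a Counter of the values in the second column."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    # Counter tallies every item produced by the generator
    return Counter(row[1] for row in reader)


counts = count_item_types_counter(io.StringIO(SAMPLE))
top_two = counts.most_common(2)
print(top_two)  # [('Fuzz', 3), ('Phaser', 2)]
```

Counter is also a dict subclass, and its most_common method gives the entries ordered by count, which a plain dict or defaultdict does not offer directly.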

Answer 3

You are asking the wrong question in your if statement:

itemDic[row[1]] not in itemDic

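itemDic maps each type to its repetition count.

What you are asking:

Is the repetition count of the type in row[1] not in the dictionary?

What you want to ask:

Is the type in row[1] not in the dictionary?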

row[1] not in itemDic
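
The difference between the two questions can be seen in a small sketch. The names item_dic and key are illustrative stand-ins for itemDic and row[1]:

```python
item_dic = {}          # empty dictionary, as at the start of the loop
key = "Fuzzstortion"   # stands in for row[1] on the first data row

try:
    # The subscript item_dic[key] is evaluated first, before "not in",
    # so a missing key raises KeyError here.
    item_dic[key] not in item_dic
    raised = False
except KeyError:
    raised = True

# Testing the key itself never raises; it simply yields True or False.
key_missing = key not in item_dic

print(raised, key_missing)  # True True
```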

As a side note, try to put all imports at the top of the file; it is clearer and more readable.

How do I create a dictionary with the contents of a csv column as keys and occurrence counts as values?
