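Create a dictionary from a csv file using a specific key

Asked: 2018-08-24 21:18:15

Tags: python python-3.x dictionary

I have a csv file containing details of faculty members. Several members share the same last name. I am trying to create a dictionary with the last name as the key and the other details as the values. The data looks like this: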

name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor of ABC
F.Li,MPH Professor of DCD

I am trying to get a dictionary like this:

mydict={"Li":[[' phd.', 'Associate Professor of Biostats'], ['MPH','Professor of DCD']], 'Chiou': [[' MD', 'Professor of ABC']]}

I used the code below, but it does not work.

reader = csv.reader(open('faculty.csv'))  
mydict = {}  
for rows in reader:  
    k = rows[0]  
    v = rows[1:]  
    mydict[k] = v
print (mydict)

I also tried the following code:

    reader = csv.reader(open('faculty.csv'))
    mydict = {rows[0]:rows[1:] for rows in reader}
    print (mydict)

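1 answer:

Answer 0 (score: 0)

Your code and demo data have 2 problems:

  1. Your last names differ in case, so they are different when used as keys.
  2. Your names are prefixed with first-name initials, and you do not split the last-name part off from them.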
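A minimal sketch of why plain assignment into the dictionary loses data, separate from the two points above. The rows here are made-up stand-ins for already-split last names, not the real file contents:

```python
# Hypothetical (last name, degree) pairs, for illustration only.
rows = [("Li", "phd"), ("Li", "MPH"), ("li", "MD")]

overwritten = {}
for key, value in rows:
    # Plain assignment replaces any earlier value stored under this key.
    overwritten[key] = value

# 'Li' and 'li' are distinct keys, and only the last 'Li' row survives.
print(overwritten)  # {'Li': 'MPH', 'li': 'MD'}
```

This is why the values need to be collected in a list per key rather than assigned directly.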

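You can handle both in code. I assume the last name is always the string after the last dot in the first column, and everything before it is the first-name part (initials).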
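The splitting assumption above could be sketched roughly as follows. The helper name `split_name` and the example "J.R.Smith" are hypothetical and only serve to show how names with several dots, or none, would be treated:

```python
def split_name(full_name):
    """Split on the last dot: return (initials, last_name).

    Hypothetical helper; assumes the last name never contains a dot.
    """
    prefix, dot, last_name = full_name.rpartition(".")
    # With no dot in the input, rpartition returns ('', '', full_name).
    return prefix + dot, last_name

print(split_name("S.li"))       # ('S.', 'li')
print(split_name("J.R.Smith"))  # ('J.R.', 'Smith')
print(split_name("Chiou"))      # ('', 'Chiou')
```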


Create a demo data file with duplicate entries (Li vs Li vs li)

with open("faculty.csv","w") as f:
    f.write("""name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor of ABC
F.Li,MPH, Professor of DCD
K.Li,MPH Professor of XYZ
""")

Process the csv file into a dictionary

import csv

# process data
mydict = {}

with open('faculty.csv') as r:
    reader = csv.reader(r)  
    # skip header row
    next(reader, None) 
    # process data rows
    for rows in reader:   
        k = rows[0]
        v = rows[1:]  

        # Name has . in it:
        if '.' in k:
            # everything up to and including the last . is the first-name
            # part (initials), everything after the last . is the last name
            # we keep the initials in the data, use only lastName as key
            prefix, dot, lastName = k.rpartition('.')
            initials = prefix + dot      # e.g. 'S.' for 'S.li'
            v.append(initials)           # add initials to the data
        else:
            lastName = k                 # no initials

        # create/get key in/from dict if needed, prepopulate value with empty list
        key = mydict.setdefault(lastName,[])
        # append data
        key.append(v)

print (mydict)

Output (formatted):

{'Chiou': [['MD', ' Professor of ABC', 'd.']], 
 'Li':    [['MPH', ' Professor of DCD', 'F.'], ['MPH Professor of XYZ', 'K.']], 
 'li':    [['phd', 'Associate Professor of Biostats', 'S.']]}

If your data is inconsistent in this way, you could consider applying .capitalize() or .title() to the key to normalize the names:

name = "one naMe"

print(name[0].upper()+name[1:])
print(name.capitalize())
print(name.title())

Output:

One naMe     # name[0].upper()+name[1:]
One name     # .capitalize()
One Name     # .title()
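Putting the pieces together, here is one possible sketch that normalizes the key while grouping. It is not part of the original answer: the helper name `group_by_last_name` is hypothetical, the demo data is held in memory instead of in faculty.csv, and `skipinitialspace=True` is an added choice that drops the blank after each comma.

```python
import csv
import io
from collections import defaultdict

# Same demo rows as above, kept in memory so the sketch runs without a file.
DEMO = """name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor of ABC
F.Li,MPH, Professor of DCD
K.Li,MPH Professor of XYZ
"""

def group_by_last_name(lines):
    """Group csv rows by capitalized last name (hypothetical helper)."""
    grouped = defaultdict(list)
    # skipinitialspace=True ignores whitespace directly after a delimiter.
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader, None)  # skip header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        prefix, dot, last_name = row[0].rpartition(".")
        values = row[1:]
        if dot:
            values.append(prefix + dot)  # keep the initials with the data
        # capitalize() makes 'li' and 'Li' land under the same key
        grouped[last_name.capitalize()].append(values)
    return dict(grouped)

result = group_by_last_name(io.StringIO(DEMO))
print(result)
```

With this normalization all three Li rows end up under one key. Note that .capitalize() lowercases everything after the first letter, so a name with inner capitals would be altered; whether that is acceptable depends on the data.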