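I have a CSV file containing details of faculty members. Several members share the same last name. I am trying to create a dictionary with the last name as the key and the other details as the values. The data looks like this: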
name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor ofABC
F.Li,MPH Professor of DCD
I am trying to get a dictionary in the following form:
mydict={"Li":[[' phd.', 'Associate Professor of Biostats'], ['MPH','Professor of DCD']], 'Chiou': [[' MD', 'Professor of ABC']]}
I used the code below, but it does not work.
reader = csv.reader(open('faculty.csv'))
mydict = {}
for rows in reader:
    k = rows[0]
    v = rows[1:]
    mydict[k] = v
print (mydict)
I also tried the following code:
reader = csv.reader(open('faculty.csv'))
mydict = {rows[0]:rows[1:] for rows in reader}
print (mydict)
Answer 0 (score: 0)
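Your code and your demo data have two problems, and you can handle both of them in code. I assume that the last name is always the string after the last dot in the first column, and that everything before it is the given-name part (called surName in the code below).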
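The splitting assumption can be illustrated on its own. This is only a minimal sketch: split_name is a hypothetical helper name that does not appear in the original code, and it uses str.rpartition instead of the split/slice combination used further down.

```python
def split_name(full_name):
    """Split 'F.Li' into ('F.', 'Li') at the last dot (hypothetical helper)."""
    # rpartition splits at the LAST occurrence of the separator;
    # without a dot it returns ('', '', full_name)
    initials, dot, last_name = full_name.rpartition('.')
    # re-attach the dot so the given-name part keeps its trailing '.'
    return initials + dot, last_name


print(split_name("F.Li"))
print(split_name("A.B.Smith"))
print(split_name("Chiou"))
```

A name without any dot comes back with an empty given-name part, so the caller does not need a separate branch for that case.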
Create a demo data file with duplicate entries (Li vs Li vs li):
with open("faculty.csv","w") as f:
    f.write("""name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor of ABC
F.Li,MPH, Professor of DCD
K.Li,MPH Professor of XYZ
""")
Process the CSV file into a dictionary:
import csv
# process data
mydict = {}
with open('faculty.csv') as r:
    reader = csv.reader(r)
    # skip header row
    next(reader, None)
    # process data rows
    for rows in reader:
        k = rows[0]
        v = rows[1:]
        # Name has . in it:
        if '.' in k:
            # everything up to the last . is the given-name part (surName here),
            # everything after the last . is lastName
            # we add the given-name part to the data, use only lastName as key
            lastName = k.split('.')[-1]
            surName = k[:-len(lastName)] # strip lastName from the full name
            v.append(surName) # add given-name part to data
        else:
            lastName = k # no given-name part
        # create/get key in/from dict if needed, prepopulate value with empty list
        key = mydict.setdefault(lastName,[])
        # append data
        key.append(v)
print (mydict)
Output (formatted):
{'Chiou': [['MD', ' Professor of ABC', 'd.']],
'Li': [['MPH', ' Professor of DCD', 'F.'], ['MPH Professor of XYZ', 'K.']],
'li': [['phd', 'Associate Professor of Biostats', 'S.']]}
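As a variation on the grouping loop above, the same idea can be sketched with collections.defaultdict. The sketch makes a few assumptions that are not in the original: the demo data is read from an in-memory io.StringIO instead of faculty.csv, the given-name part is dropped rather than appended, and skipinitialspace=True is passed to csv.reader so that the space after a comma is not kept in the value.

```python
import csv
import io
from collections import defaultdict

# in-memory copy of the demo data (assumption: same content as faculty.csv)
demo = io.StringIO("""name,degree,title
S.li,phd,Associate Professor of Biostats
d.Chiou,MD, Professor of ABC
F.Li,MPH, Professor of DCD
K.Li,MPH Professor of XYZ
""")

grouped = defaultdict(list)  # missing keys start out as an empty list
reader = csv.reader(demo, skipinitialspace=True)  # drop the space after a comma
next(reader, None)  # skip header row
for name, *details in reader:
    last_name = name.rpartition('.')[2]  # part after the last dot
    grouped[last_name].append(details)

print(dict(grouped))
```

The row with the missing comma still ends up as a single combined value, because no reader option can guess where the delimiter should have been.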
If your data is inconsistent, you could consider using .capitalize() or .title() on the key to normalize the names:
name = "one naMe"
print(name[0].upper()+name[1:])
print(name.capitalize())
print(name.title())
Output:
One naMe # name[0].upper()+name[1:]
One name # .capitalize()
One Name # .title()
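Applied to the dictionary keys, this normalization merges 'li' and 'Li' into one entry. A minimal sketch, assuming the dictionary printed earlier as input; the name merged is introduced here only for illustration.

```python
# dictionary as produced by the grouping code above
mydict = {
    'Chiou': [['MD', ' Professor of ABC', 'd.']],
    'Li': [['MPH', ' Professor of DCD', 'F.'], ['MPH Professor of XYZ', 'K.']],
    'li': [['phd', 'Associate Professor of Biostats', 'S.']],
}

merged = {}
for key, entries in mydict.items():
    # 'li' and 'Li' both become 'Li', so their entries land in the same list
    merged.setdefault(key.capitalize(), []).extend(entries)

print(merged)
```

Note that .capitalize() lowercases everything after the first character, so a key such as 'McDonald' would become 'Mcdonald'; whether that is acceptable depends on the names in your data.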