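I haven't done much Python programming. I'm trying to read in a basic CSV and then create a nested dictionary from it. This is what I have so far; I seem to have a problem with my looping or with overwriting my dictionary. I know it isn't efficient.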
import csv

reader = csv.DictReader(open("fruit.csv"))
fruit_dict = {}
color_dict = {}
for row in reader:
    info_list = []
    count = row.pop('count')
    info_list.append(count)
    year = row.pop('year')
    info_list.append(year)
    info = row.pop('info')
    info_list.append(info)
    if row['color'] not in color_dict:
        #print row['color']
        color_dict['color'] = row['color']
        #print fruit_dict
        if row['fruit'] not in fruit_dict:
            fruit_dict['name'] = row['fruit']
            #print fruit_dict
            #print info_list
            list_of_info_lists = []
            list_of_info_lists.append(info_list)
            fruit_dict['fruitInfo'] = list_of_info_lists
            color_dict['fruit'] = fruit_dict
            #print color_dict
        else:
            list_of_info_lists.append(info_list)
            fruit_dict['fruitInfo'] = list_of_info_lists
            color_dict['fruit'] = fruit_dict
            #print color_dict
    else:
        if row['color'] in color_dict:
            if row['fruit'] not in fruit_dict:
                fruit_dict['name'] = row['fruit']
                #print fruit_dict
                #print info_list
                list_of_info_lists = []
                list_of_info_lists.append(info_list)
                fruit_dict['fruitInfo'] = list_of_info_lists
                color_dict['fruit'] = fruit_dict
                #print color_dict
            else:
                list_of_info_lists.append(info_list)
                fruit_dict['fruitInfo'] = list_of_info_lists
                color_dict['fruit'] = fruit_dict
                #print color_dict
#print color_dict
Here is the CSV:
color,fruit,year,count,info
red,apple,1970,3,good
red,apple,1922,5,okay
orange,orange,1935,2,okay
green,celery,2001,22,marginal
red,cherries,1999,5,outstanding
orange,carrot,1952,7,okay
green,celery,2014,2,good
green,grapes,2001,12,good
What I get is:
{'color': 'green', 'fruit': {'name': 'grapes', 'fruitInfo': [['12', '2001', 'good']]}}
That's lovely, except I was expecting more than one entry, and I was expecting a list of lists wherever the name already exists, e.g.:
{'color': 'red', 'fruit': {'name': 'apple', 'fruitInfo': [['5', '1922', 'okay'],['3', '1970', 'good']]}}
Any suggestions would be greatly appreciated. The ultimate goal is to produce a JSON file.
Thanks, Susan
Here is the format I want in the end:
[{'color': 'red', 'fruit': {'name': 'apple', 'fruitInfo': [['5', '1922', 'okay'],['3', '1970', 'good']]}},
{'color': 'red', 'fruit': {'name': 'cherries', 'fruitInfo': [['5', '1999', 'outstanding']]}},
{'color': 'orange', 'fruit': {'name': 'orange', 'fruitInfo': [['2', '1935', 'okay']]}},
{'color': 'orange', 'fruit': {'name': 'carrot', 'fruitInfo': [['7', '1952', 'okay']]}},
{'color': 'green', 'fruit': {'name': 'celery', 'fruitInfo': [['2', '2014', 'good'],['22', '2001', 'marginal']]}},
{'color': 'green', 'fruit': {'name': 'grapes', 'fruitInfo': [['12', '2001', 'good']]}}]
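Answer 0 (score: 2)
Jon Clements' answer is the best solution. If you'd like something closer to what you started with, to help you understand what might have gone wrong, take a look at the following: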
# reader is the csv.DictReader from the question
results_list = []
colorFruitTuple_set = set()
for row in reader:
    info_list = [row['count'], row['year'], row['info']]
    if (row['color'], row['fruit']) not in colorFruitTuple_set:
        # first time this (color, fruit) pair is seen: build fresh dicts
        color_dict = {}
        fruit_dict = {}
        color_dict['color'] = row['color']
        fruit_dict['name'] = row['fruit']
        list_of_info_lists = [info_list]
        fruit_dict['fruitInfo'] = list_of_info_lists
        color_dict['fruit'] = fruit_dict
        results_list.append(color_dict)
        colorFruitTuple_set.add((row['color'], row['fruit']))
    else:
        # pair already seen: find its entry and extend its fruitInfo
        for color_dict in results_list:
            if color_dict["color"] == row['color'] and color_dict["fruit"]["name"] == row["fruit"]:
                color_dict["fruit"]["fruitInfo"].append(info_list)
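I think this matches what you're aiming for. You were trying to use the same color_dict and fruit_dict when you needed to create several of them, which also meant you couldn't use them to keep track of duplicates. This is for learning purposes only; Jon's way is the right way to do it.
Hope this helps!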
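To illustrate the point about reusing the same dictionary, here is a minimal sketch (the names `shared`, `reused`, and `fresh` are made up for this illustration and are not from the question). It shows that appending one dict object repeatedly leaves every list entry pointing at the same object, so later assignments overwrite earlier ones, while creating a new dict per row keeps the entries independent.

```python
# One dict object reused for every row: each list entry is the same
# object, so the last assignment is what all entries end up showing.
shared = {}
reused = []
for color in ["red", "orange", "green"]:
    shared["color"] = color
    reused.append(shared)

# A fresh dict per row: each entry keeps its own value.
fresh = []
for color in ["red", "orange", "green"]:
    fresh.append({"color": color})

print([d["color"] for d in reused])
print([d["color"] for d in fresh])
```

This is why the code in this answer creates color_dict and fruit_dict inside the loop, once per new (color, fruit) pair.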
Answer 1 (score: 2)
You can use a defaultdict of list here, where each list holds the fruitInfo and a 2-tuple (color and fruit) is the key, then reformat it, e.g.:
import csv
from collections import defaultdict

dd = defaultdict(list)
with open('yourfile.csv') as fin:
    csvin = csv.DictReader(fin)
    for row in csvin:
        dd[row['color'], row['fruit']].append([row['count'], row['year'], row['info']])
Then reformat dd slightly using:
reformatted = [{'color': c, 'fruit': {'name': f, 'fruitInfo': v}} for (c, f), v in dd.items()]
Which gives you:
[{'color': 'orange',
  'fruit': {'fruitInfo': [['7', '1952', 'okay']], 'name': 'carrot'}},
 {'color': 'green',
  'fruit': {'fruitInfo': [['12', '2001', 'good']], 'name': 'grapes'}},
 {'color': 'orange',
  'fruit': {'fruitInfo': [['2', '1935', 'okay']], 'name': 'orange'}},
 {'color': 'red',
  'fruit': {'fruitInfo': [['3', '1970', 'good'], ['5', '1922', 'okay']],
            'name': 'apple'}},
 {'color': 'red',
  'fruit': {'fruitInfo': [['5', '1999', 'outstanding']], 'name': 'cherries'}},
 {'color': 'green',
  'fruit': {'fruitInfo': [['22', '2001', 'marginal'], ['2', '2014', 'good']],
            'name': 'celery'}}]
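Since the stated end goal is a JSON file, the grouped list can be passed to the standard json module. The following is only a sketch under some assumptions: it feeds the question's sample rows through io.StringIO (the `CSV_TEXT` name is made up here) so that it runs without a file on disk, and it produces a JSON string rather than writing a file.

```python
import csv
import io
import json
from collections import defaultdict

# Stand-in for the question's CSV file so the sketch runs on its own.
CSV_TEXT = """color,fruit,year,count,info
red,apple,1970,3,good
red,apple,1922,5,okay
orange,orange,1935,2,okay
green,celery,2001,22,marginal
red,cherries,1999,5,outstanding
orange,carrot,1952,7,okay
green,celery,2014,2,good
green,grapes,2001,12,good
"""

# Group the [count, year, info] lists under a (color, fruit) key.
dd = defaultdict(list)
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    dd[row['color'], row['fruit']].append([row['count'], row['year'], row['info']])

# Reshape each group into the nested layout asked for in the question.
reformatted = [
    {'color': c, 'fruit': {'name': f, 'fruitInfo': v}}
    for (c, f), v in dd.items()
]

# json.dumps returns a string; json.dump(reformatted, fout) would write
# to an already opened file object fout instead.
json_text = json.dumps(reformatted, indent=2)
print(json_text)
```

On Python 3.7 and later, dicts keep insertion order, so the entries come out in the order each (color, fruit) pair first appears in the CSV rather than in the arbitrary order of the output shown above.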
Answer 2 (score: 0)
When working with dictionaries of dictionaries, my pattern is this:
sub_dict = main_dict.get(key, {})
sub_dict[sub_key] = sub_value
main_dict[key] = sub_dict
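This gets the sub-dictionary, or {} if it doesn't exist yet. It then assigns a value in the sub-dictionary and puts the sub-dictionary back into the main dictionary.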
fruit_dict = {}
for row in reader:
    # make the info_list
    info_list = [row['count'], row['year'], row['info']]
    # extract color and fruit into variables
    color = row['color']
    fruit = row['fruit']
    # unpack the dictionaries and list
    colors = fruit_dict.get(color, {})
    fruits = colors.get(fruit, {})
    info = fruits.get('info', [])
    # reassemble the list and dictionaries
    info.append(info_list)
    fruits['info'] = info
    colors[fruit] = fruits
    fruit_dict[color] = colors
The result is slightly different from your example, but that change was needed in order to use color and fruit as keys.
{'orange': {'orange': {'info': [['2', '1935', 'okay']]}, 'carrot': {'info': [['7', '1952', 'okay']]}}, 'green': {'celery': {'info': [['22', '2001', 'marginal'], ['2', '2014', 'good']]}, 'grapes': {'info': [['12', '2001', 'good']]}}, 'red': {'cherries': {'info': [['5', '1999', 'outstanding']]}, 'apple': {'info': [['3', '1970', 'good'], ['5', '1922', 'okay']]}}}
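As a possible shortening of the get / assign / put-back pattern, the built-in dict.setdefault method returns the existing value for a key, or stores the given default and returns it. This is a swapped-in alternative rather than the method used in this answer, and the sketch below assumes a small inline sample (the `CSV_TEXT` name is made up) instead of the full file.

```python
import csv
import io

# A few of the question's rows, inlined so the sketch runs on its own.
CSV_TEXT = """color,fruit,year,count,info
red,apple,1970,3,good
red,apple,1922,5,okay
green,celery,2001,22,marginal
"""

fruit_dict = {}
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    info_list = [row['count'], row['year'], row['info']]
    # setdefault fetches the nested dict, creating and storing {} if missing,
    # so nothing has to be put back into the parent afterwards.
    fruits = fruit_dict.setdefault(row['color'], {}).setdefault(row['fruit'], {})
    fruits.setdefault('info', []).append(info_list)

print(fruit_dict)
```

The resulting structure has the same shape as above: color, then fruit, then an 'info' list of lists.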