Create a nested dictionary from two CSV files

Time: 2017-03-05 04:41:50

Tags: python csv dictionary nested key-value

I have two CSV files.
file1.csv:

ID,map1,map2  
a,x1,x2  
b,y1,  
c,z1,z2  

file2.csv:

ID,map1Val1,map1Val2,map2Val1
a,a1,a2,l1
b,b1,b2,
c,c1,c2,n1

I would like the output to look like this:

{'ID': {'map1':['map1Val1','map1Val2'], 'map2':'map2Val1'},'a': {'x1':['a1','a2'], 'x2':'l1'},'b': {'y1':['b1','b2']},'c': {'z1':['c1','c2'], 'z2':'n1'},}  

I can't think of any way to build this. So far I only have code that creates a dictionary from a single CSV file:

import csv
new_data_dict = {}
with open("file1.csv", 'r') as map_file:
    mapping = csv.DictReader(map_file, delimiter=",")
    for row in mapping:
        new_data_dict= {row[0]:{row[1],row[2]}}
print new_data_dict

Output:

{"ID":{map1,map2}, "a":{x1,x2}, "b":{y1}, "c":{z1,z2}}

2 Answers:

Answer 0 (score: 1)

You can use zip to pair up the rows from the two CSV files:

>>> list(zip([1,2,3], [4,5,6]))   # assume 1, 2, 3 /  4, 5, 6 as row values
[(1, 4), (2, 5), (3, 6)]

import csv

new_data_dict = {}
with open('file1.csv') as f1, open('file2.csv') as f2:
    reader1, reader2 = csv.reader(f1), csv.reader(f2)
    for row1, row2 in zip(reader1, reader2):
        id_, map1, map2 = row1
        new_data_dict[id_] = {map1: row2[1:3]}
        map2 = map2.strip()
        if map2:  # put map2 only if map2 key exists
            new_data_dict[id_][map2] = row2[3]

new_data_dict becomes:

{'ID': {'map1': ['map1Val1', 'map1Val2'], 'map2': 'map2Val1'},
 'a': {'x1': ['a1', 'a2'], 'x2': 'l1'},
 'b': {'y1': ['b1', 'b2']},
 'c': {'z1': ['c1', 'c2'], 'z2': 'n1'}}
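
Note that zip pairs rows purely by position, so the approach above assumes both files list the IDs in the same order. If that is not guaranteed, one possible alternative (a sketch, not part of the original answer) is to index the second file by ID first and look each row up. The function name join_by_id is made up for illustration, and io.StringIO objects stand in for the real files so the snippet runs on its own; it also assumes every file1 row has exactly three columns.

```python
import csv
import io

def join_by_id(f1, f2):
    """Build the nested dict, matching rows of f2 to rows of f1 by ID."""
    # Index the rows of the second file by their first column (the ID).
    values_by_id = {row[0]: row for row in csv.reader(f2) if row}
    result = {}
    for row1 in csv.reader(f1):
        if not row1:
            continue  # skip blank lines
        # Assumption: file1 rows always have exactly three columns.
        id_, map1, map2 = (cell.strip() for cell in row1)
        row2 = values_by_id.get(id_)
        if row2 is None:
            continue  # no row with this ID in the second file
        result[id_] = {map1: row2[1:3]}
        if map2:  # add the map2 entry only when file1 has a map2 key
            result[id_][map2] = row2[3]
    return result

# In-memory stand-ins for file1.csv and file2.csv; file2 rows are reordered.
file1 = io.StringIO("ID,map1,map2\na,x1,x2\nb,y1,\nc,z1,z2\n")
file2 = io.StringIO(
    "ID,map1Val1,map1Val2,map2Val1\nc,c1,c2,n1\na,a1,a2,l1\nb,b1,b2,\n"
)
nested = join_by_id(file1, file2)
print(nested)
```

With real files, the two StringIO objects would be replaced by the handles from open('file1.csv') and open('file2.csv').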

Answer 1 (score: 1)

Here is a more dynamic solution that lets you configure in advance which columns in file2 map to which columns of file1:

import csv

# Which file2 columns hold the values for each file1 column
column_map = {'map1': ['map1Val1', 'map1Val2'], 'map2': ['map2Val1']}

joined_data = dict()
joined_data['ID'] = column_map
with open("file1.txt") as f1, open("file2.txt") as f2:
    key_list = list(csv.DictReader(f1))
    value_list = list(csv.DictReader(f2))
    for kl, vl in zip(key_list, value_list):
        inner = {}
        for key, value_cols in column_map.items():
            if kl[key]:  # skip empty cells in file1
                inner[kl[key]] = [vl[el] for el in value_cols]
        joined_data[kl['ID']] = inner

Using csv.DictReader lets us map each row's data to a dict whose keys are (by default) given by the first line of the file. The two sequences of dict objects are cast to lists and iterated over with zip. Using column_map as a guide, we create a new inner dictionary that associates the keys from key_list with the values from value_list.

Edit

For a fully dynamic solution, you can create column_map dynamically by comparing the column headers in file1 with the column headers in file2.
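
The header comparison mentioned in the edit could look roughly like the sketch below. It is not from the original answer: the helper name build_column_map is hypothetical, and the matching rule is an assumption based on the sample data, namely that each file2 header is a file1 header followed by "Val" and a number. It also assumes the headers carry no stray whitespace. io.StringIO objects stand in for the two files so the snippet runs on its own.

```python
import csv
import io

def build_column_map(key_headers, value_headers, id_column="ID"):
    """Map each file1 header to the file2 headers named '<header>Val...'."""
    column_map = {}
    for key in key_headers:
        if key == id_column:
            continue  # the ID column is the join key, not a mapping column
        # Assumption: file2 headers follow the '<file1 header>Val<n>' pattern.
        column_map[key] = [h for h in value_headers if h.startswith(key + "Val")]
    return column_map

# In-memory stand-ins for the two files.
file1 = io.StringIO("ID,map1,map2\na,x1,x2\nb,y1,\nc,z1,z2\n")
file2 = io.StringIO("ID,map1Val1,map1Val2,map2Val1\na,a1,a2,l1\nb,b1,b2,\nc,c1,c2,n1\n")

reader1, reader2 = csv.DictReader(file1), csv.DictReader(file2)
# Reading .fieldnames consumes only the header row of each file.
column_map = build_column_map(reader1.fieldnames, reader2.fieldnames)

joined_data = {"ID": column_map}
for kl, vl in zip(reader1, reader2):
    inner = {}
    for key, value_cols in column_map.items():
        if kl[key]:  # skip empty cells in file1
            inner[kl[key]] = [vl[col] for col in value_cols]
    joined_data[kl["ID"]] = inner
print(joined_data)
```

As in the answer's code, every value ends up as a list, so single-column mappings such as map2 yield one-element lists rather than bare strings.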