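This is my input CSV file with multiple columns. I want to convert it to a JSON file that has department, departmentID, and one nested field called customer, with first and last nested inside that field.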
department, departmentID, first, last
fans, 1, Caroline, Smith
fans, 1, Jenny, White
students, 2, Ben, CJ
students, 2, Joan, Carpenter
...
The JSON output I need:
[
  {
    "department": "fans",
    "departmentID": "1",
    "customer": [
      {
        "first": "Caroline",
        "last": "Smith"
      },
      {
        "first": "Jenny",
        "last": "White"
      }
    ]
  },
  {
    "department": "students",
    "departmentID": "2",
    "customer": [
      {
        "first": "Ben",
        "last": "CJ"
      },
      {
        "first": "Joan",
        "last": "Carpenter"
      }
    ]
  }
]
My code:
from csv import DictReader
from itertools import groupby

with open('data.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]

groups = []
uniquekeys = []
for k, g in groupby(data, lambda r: (r['group'], r['groupID'])):
    groups.append({
        "group": k[0],
        "groupID": k[1],
        "user": [{k:v for k, v in d.items() if k != 'group'} for d in list(g)]
    })
    uniquekeys.append(k)
pprint(groups)
My problem: groupID shows up twice in the data, both outside and inside the nested JSON. What I want is group and groupID as the groupby keys.
Answer 0 (score: 0)
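The problem is that you mixed up the key names, so this line

    "user": [{k:v for k, v in d.items() if k != 'group'} for d in list(g)]

does not remove them from the dictionary. The rows have no key named 'group', so the filter never matches and nothing is removed.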
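To make this concrete, here is a minimal sketch. The `row` dict is a hypothetical record shaped like what `DictReader` would yield for the CSV in the question; it is an assumption for illustration, not taken from your actual file.

```python
# Hypothetical row, shaped like one DictReader record from the question's CSV.
row = {"department": "fans", "departmentID": "1", "first": "Caroline", "last": "Smith"}

# Filtering on a key that does not exist keeps every key/value pair.
unchanged = {k: v for k, v in row.items() if k != "group"}

# Filtering on the real column names leaves only the nested fields.
nested = {k: v for k, v in row.items() if k not in ("department", "departmentID")}

print(unchanged == row)  # True
print(nested)            # {'first': 'Caroline', 'last': 'Smith'}
```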
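I am not exactly sure which keys you want, so the example below assumes that data.csv looks like the one in your question, with department and departmentID columns, while the script converts them to group and groupID in the output: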
from csv import DictReader
from itertools import groupby
from pprint import pprint

with open('data.csv') as csvfile:
    # skipinitialspace drops the blank after each comma in the header and rows
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]

groups = []
uniquekeys = []
# groupby only merges adjacent rows, so data.csv must already be ordered by department
for k, g in groupby(data, lambda row: (row['department'], row['departmentID'])):
    groups.append({
        "group": k[0],
        "groupID": k[1],
        # keep every column except the two used as the grouping key
        "user": [{key: v for key, v in d.items() if key not in ['department', 'departmentID']} for d in g]
    })
    uniquekeys.append(k)
pprint(groups)
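I used different keys for input and output so that it is clear which line does what, and so that it is easy to adapt to other key names on either side.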
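If you want the exact key names from the JSON in the question, the same idea can be sketched as below. This is only a sketch under a few assumptions: `StringIO` stands in for data.csv so the snippet runs on its own, `nest_customers` and `GROUP_KEYS` are names I made up for illustration, and the rows are sorted first because `groupby` only merges adjacent rows.

```python
import json
from csv import DictReader
from io import StringIO
from itertools import groupby

# Sample rows copied from the question; StringIO stands in for data.csv.
CSV_TEXT = """department, departmentID, first, last
fans, 1, Caroline, Smith
fans, 1, Jenny, White
students, 2, Ben, CJ
students, 2, Joan, Carpenter
"""

# Columns used as the grouping key (hypothetical constant name).
GROUP_KEYS = ("department", "departmentID")


def nest_customers(rows):
    """Group rows by GROUP_KEYS and nest the remaining columns under "customer"."""
    def key(row):
        return tuple(row[name] for name in GROUP_KEYS)

    result = []
    # groupby only merges adjacent rows, so sort by the same key first.
    for (department, department_id), members in groupby(sorted(rows, key=key), key=key):
        result.append({
            "department": department,
            "departmentID": department_id,
            "customer": [
                {k: v for k, v in row.items() if k not in GROUP_KEYS}
                for row in members
            ],
        })
    return result


rows = list(DictReader(StringIO(CSV_TEXT), skipinitialspace=True))
groups = nest_customers(rows)
print(json.dumps(groups, indent=2))
```

Note that the csv module reads every value as a string, so departmentID comes out as "1" and "2"; convert it with `int()` inside `nest_customers` if you need numbers in the JSON.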