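For an application, I need to convert a CSV data file to nested JSON in Python. My current Python code below works fine for a single Customer/Accounts document, but for some reason it fails to create the JSON dump for all the customers in the CSV file.
I have provided the Python code below, which should give you some idea of what I am trying to achieve. Please let me know if there is an existing solution.
Sample Python code: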
import pandas as pd
from itertools import groupby
from collections import OrderedDict
import json

# Read every column as a string so leading zeros (e.g. "00001") survive.
df = pd.read_csv('cust.csv', dtype={
    "ClientID" : str,
    "ClientName" : str,
    "AcctID" : str,
    "AcctNbr" : str,
    "AcctTyp" : str
})

results = []
for (ClientID, ClientName), bag in df.groupby(["ClientID", "ClientName"]):
    contents_df = bag.drop(["ClientID", "ClientName"], axis=1)
    subset = [OrderedDict(row) for i,row in contents_df.iterrows()]
    results.append(OrderedDict([("ClientID", ClientID),("ClientName", ClientName),("subset", subset)]))

print(json.dumps(results[0], indent=4))
with open('ExpectedJsonFile.json', 'w') as outfile:
    outfile.write(json.dumps(results[0], indent=4))
Sample input CSV:
ClientID,ClientName,AcctID,AcctNbr,AcctTyp
----------------------------------------------------------
00001,John George,812001,812001095,DDA
00001,John George,813002,813002096,SAV
00001,John George,814003,814003097,AFS
00024,Richard Polado,512987,512987085,ML
00024,Richard Polado,512983,512983086,IL
00345,John Cruze,1230,123001567,SAV
00345,John Cruze,5145,514502096,CD
00345,John Cruze,7890,7890033527,SGD
Desired output JSON:
{
    "clientId":00001,
    "ClientName":"John George",
    "subset":[
        {
            "AcctID":812001,
            "AcctNbr":"812001095",
            "AcctTyp":"DDA",
        },
        {
            "AcctID":813002,
            "AcctNbr":"813002096",
            "AcctTyp":"SAV",
        },
        {
            "AcctID":814003,
            "AcctNbr":"814003097",
            "AcctTyp":"AFS",
        }
    ]
},
{
    "clientId":00024,
    "ClientName":"Richard Polado",
    "subset":[
        {
            "AcctID":512987,
            "AcctNbr":"512987085",
            "AcctTyp":"ML",
        },
        {
            "AcctID":512983,
            "AcctNbr":"512983086",
            "AcctTyp":"IL",
        }
    ]
}
Documents like these should continue to be created for the thousands of other customers.
Answer 0 (score: 2)
The solution groups the rows by each ('ClientID', 'ClientName') pair.
Your DataFrame:
df = pd.DataFrame([['00001','John George','812001','812001095','DDA'],
                   ['00001','John George','813002','813002096','SAV'],
                   ['00001','John George','814003','814003097','AFS'],
                   ['00024','Richard Polado','512987','512987085','ML'],
                   ['00024','Richard Polado','512983','512983086','IL'],
                   ['00345','John Cruze','1230','123001567','SAV'],
                   ['00345','John Cruze','5145','514502096','CD'],
                   ['00345','John Cruze','7890','7890033527','SGD']])
df.columns = ['ClientID','ClientName','AcctID','AcctNbr','AcctTyp']
Now:
finalList = []
finalDict = {}
grouped = df.groupby(['ClientID', 'ClientName'])
for key, value in grouped:
    dictionary = {}
    j = grouped.get_group(key).reset_index(drop=True)
    dictionary['ClientID'] = j.at[0, 'ClientID']
    dictionary['ClientName'] = j.at[0, 'ClientName']
    dictList = []
    anotherDict = {}
    for i in j.index:
        anotherDict['AcctID'] = j.at[i, 'AcctID']
        anotherDict['AcctNbr'] = j.at[i, 'AcctNbr']
        anotherDict['AcctTyp'] = j.at[i, 'AcctTyp']
        dictList.append(anotherDict)
    dictionary['subset'] = dictList
    finalList.append(dictionary)

import json
json.dumps(finalList)
gives:
'[
{"ClientID": "00001",
"ClientName": "John George",
"subset":
[{"AcctID": "814003",
"AcctNbr": "814003097",
"AcctTyp": "AFS"},
{"AcctID": "814003",
"AcctNbr": "814003097",
"AcctTyp": "AFS"},
{"AcctID": "814003",
"AcctNbr": "814003097",
"AcctTyp": "AFS"}]
},
{
"ClientID": "00024",
"ClientName": "Richard Polado",
"subset":
[{"AcctID": "512983",
"AcctNbr": "512983086",
"AcctTyp": "IL"},
{"AcctID": "512983",
"AcctNbr": "512983086",
"AcctTyp": "IL"}]
},
{
"ClientID": "00345",
"ClientName": "John Cruze",
"subset":
[{"AcctID": "7890",
"AcctNbr": "7890033527",
"AcctTyp": "SGD"},
{"AcctID": "7890",
"AcctNbr": "7890033527",
"AcctTyp": "SGD"},
{"AcctID": "7890",
"AcctNbr": "7890033527",
"AcctTyp": "SGD"}]
}
]'
Is this what you wanted?
Answer 1 (score: 1)
Use dictList.append(anotherDict.copy()); otherwise the list will contain the same dict object several times.
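To see why the copy matters, here is a minimal sketch in plain Python. The account values come from the sample CSV; the names `rows`, `shared`, `aliased`, and `copied` are made up for illustration and do not appear in the answer above.

```python
# Account IDs and types taken from the first customer in the sample CSV.
rows = [("812001", "DDA"), ("813002", "SAV"), ("814003", "AFS")]

shared = {}      # one dict object, reused on every iteration
aliased = []
for acct_id, acct_typ in rows:
    shared["AcctID"] = acct_id
    shared["AcctTyp"] = acct_typ
    aliased.append(shared)         # appends a reference to the SAME object

copied = []
for acct_id, acct_typ in rows:
    shared["AcctID"] = acct_id
    shared["AcctTyp"] = acct_typ
    copied.append(shared.copy())   # appends an independent snapshot

print([d["AcctID"] for d in aliased])  # ['814003', '814003', '814003']
print([d["AcctID"] for d in copied])   # ['812001', '813002', '814003']
```

Creating a fresh dict inside the loop (moving `anotherDict = {}` into the loop body) should have the same effect as calling `.copy()`.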
More details on this issue: Create List of Dictionary Python
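For completeness, here is a rough end-to-end sketch that builds one document per customer and dumps the whole list rather than only `results[0]`. Note that it swaps pandas for the standard library (`csv.DictReader` plus `itertools.groupby`), so it is an alternative to the answers above, not a restatement of them. The function name `nest_accounts` and the inlined `CSV_TEXT` are assumptions made for illustration; the real data would come from `cust.csv`.

```python
import csv
import io
import json
from itertools import groupby

# Sample rows from the question, inlined so the sketch is self-contained.
CSV_TEXT = """ClientID,ClientName,AcctID,AcctNbr,AcctTyp
00001,John George,812001,812001095,DDA
00001,John George,813002,813002096,SAV
00001,John George,814003,814003097,AFS
00024,Richard Polado,512987,512987085,ML
00024,Richard Polado,512983,512983086,IL
00345,John Cruze,1230,123001567,SAV
00345,John Cruze,5145,514502096,CD
00345,John Cruze,7890,7890033527,SGD
"""


def client_key(row):
    """Grouping key: the (ClientID, ClientName) pair of a CSV row."""
    return (row["ClientID"], row["ClientName"])


def nest_accounts(csv_file):
    """Return one dict per client, with that client's accounts under 'subset'."""
    # DictReader yields every field as a string, so "00001" keeps its zeros.
    # groupby only merges adjacent rows, so sort by the same key first
    # (sorted is stable, so account order within a client is preserved).
    rows = sorted(csv.DictReader(csv_file), key=client_key)
    results = []
    for (client_id, client_name), group in groupby(rows, key=client_key):
        # A new dict is built for every row, so no object is shared.
        subset = [
            {field: row[field] for field in ("AcctID", "AcctNbr", "AcctTyp")}
            for row in group
        ]
        results.append(
            {"ClientID": client_id, "ClientName": client_name, "subset": subset}
        )
    return results


results = nest_accounts(io.StringIO(CSV_TEXT))
# Serialize the whole list, not results[0], so every customer is included.
json_text = json.dumps(results, indent=4)
print(json_text)
```

To write the file from the question instead of printing, `json.dump(results, outfile, indent=4)` inside `with open('ExpectedJsonFile.json', 'w') as outfile:` should work. The output is a single JSON array of customer documents, with all IDs kept as strings.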