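I am trying to build a JSON layout. I read all of the records from an input file, and the file may contain multiple records with the same key (Id).
Sample input file: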
Id,LineNo,Amt,ReceivedDt,FromDt,ToDate,regionId
123545,1,1000.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,3,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123546,1,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
123546,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
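My logic is to read the records from the file as dictionaries and keep appending them for as long as the key (Id) matches. When the key no longer matches, clear the list, append the row with the new key, and compare subsequent records against this new key. In between, the result needs to be stored so that previously processed records are not lost (this is the part I cannot figure out).
Code: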
import json,csv
with open('Test.csv') as f:
    inputfile = csv.DictReader(f)
    output = []
    key =1
    for row in inputfile :
        if len(output)==0:
            output.append(row)
        elif len(output)>0:
            if row['Id']==key:
                output.append(row)
            else:
                del output[:]
                output.append(row)
                key=row['Id']
data = json.dumps({"data":output}, indent=4)
print(data)
Output:
Only the last two rows appear, because the previous group is deleted.
Please suggest how to store these rows.
{
"data": [
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA13",
"Id": "123546",
"LineNo": "1",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
},
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA13",
"Id": "123546",
"LineNo": "2",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
}
]
}
Desired output:
{
"data": [
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA12",
"Id": "123545",
"LineNo": "1",
"Amt": "1000.00",
"FromDt": "2019-02-01T00:00:00"
},
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA12",
"Id": "123545",
"LineNo": "2",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
},
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA12",
"Id": "123545",
"LineNo": "3",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
}
]
},
{
"data": [
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA13",
"Id": "123546",
"LineNo": "1",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
},
{
"ToDate": "2019-02-01T00:00:00",
"ReceivedDt": "2019-02-01T00:00:00",
"regionId": "WA13",
"Id": "123546",
"LineNo": "2",
"Amt": "200.00",
"FromDt": "2019-02-01T00:00:00"
}
]
}
Answer 0 (score: 0)
Use itertools.groupby, which collects consecutive rows that share the same key.
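The code for this answer did not survive, so the following is only a minimal sketch of how itertools.groupby could be applied here, not the answerer's original code. It assumes rows with the same Id are adjacent, as in the sample. The inline SAMPLE string and the helper name group_rows are illustrative assumptions that stand in for reading Test.csv.

```python
import csv
import io
import json
from itertools import groupby
from operator import itemgetter

# Inline copy of the sample rows from the question, standing in for Test.csv.
SAMPLE = """\
Id,LineNo,Amt,ReceivedDt,FromDt,ToDate,regionId
123545,1,1000.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123545,3,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123546,1,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
123546,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
"""


def group_rows(lines):
    """Return one {"data": [...]} dict per run of consecutive rows with equal Id."""
    reader = csv.DictReader(lines)
    # groupby only merges adjacent rows, so the input must be ordered by Id.
    return [{"data": list(rows)} for _, rows in groupby(reader, key=itemgetter("Id"))]


output = group_rows(io.StringIO(SAMPLE))
print(json.dumps(output, indent=4))
```

To read the real file, pass the open file object instead of io.StringIO(SAMPLE), for example group_rows(f) inside with open('Test.csv', newline='') as f.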
Answer 1 (score: 0)
Although not as concise as using itertools.groupby, here is a way to track groups of data with the same Id manually:
import csv
import json
with open('Test.csv') as f:
    output = []
    data = []
    key = None
    for row in csv.DictReader(f):
        if row['Id'] == key:
            data.append(row)
        else:
            if data:
                output.append({"data": data})
            data = []
            data.append(row)
            key = row['Id']
    if data:  # A final group?
        output.append({"data": data})
print('output:\n', json.dumps(output, indent=4))
Output:
output:
[
{
"data": [
{
"Id": "123545",
"LineNo": "1",
"Amt": "1000.00",
"ReceivedDt": "2019-02-01T00:00:00",
"FromDt": "2019-02-01T00:00:00",
"ToDate": "2019-02-01T00:00:00",
"regionId": "WA12"
},
{
"Id": "123545",
"LineNo": "2",
"Amt": "200.00",
"ReceivedDt": "2019-02-01T00:00:00",
"FromDt": "2019-02-01T00:00:00",
"ToDate": "2019-02-01T00:00:00",
"regionId": "WA12"
},
{
"Id": "123545",
"LineNo": "3",
"Amt": "200.00",
"ReceivedDt": "2019-02-01T00:00:00",
"FromDt": "2019-02-01T00:00:00",
"ToDate": "2019-02-01T00:00:00",
"regionId": "WA12"
}
]
},
{
"data": [
{
"Id": "123546",
"LineNo": "1",
"Amt": "200.00",
"ReceivedDt": "2019-02-01T00:00:00",
"FromDt": "2019-02-01T00:00:00",
"ToDate": "2019-02-01T00:00:00",
"regionId": "WA13"
},
{
"Id": "123546",
"LineNo": "2",
"Amt": "200.00",
"ReceivedDt": "2019-02-01T00:00:00",
"FromDt": "2019-02-01T00:00:00",
"ToDate": "2019-02-01T00:00:00",
"regionId": "WA13"
}
]
}
]
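Both the manual loop and itertools.groupby rely on rows with the same Id being adjacent in the file, as they are in the sample. If that ordering is not guaranteed, one possible alternative is to collect rows in a dictionary keyed by Id. The sketch below is only an illustration under that assumption: the interleaved SAMPLE_UNSORTED data and the helper name group_by_id are hypothetical and not part of the question.

```python
import csv
import io
import json

# Hypothetical input in which rows for the same Id are NOT adjacent.
SAMPLE_UNSORTED = """\
Id,LineNo,Amt,ReceivedDt,FromDt,ToDate,regionId
123545,1,1000.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
123546,1,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA13
123545,2,200.00,2019-02-01T00:00:00,2019-02-01T00:00:00,2019-02-01T00:00:00,WA12
"""


def group_by_id(lines):
    """Group rows by Id regardless of their order in the input."""
    groups = {}  # plain dicts keep insertion order in Python 3.7+
    for row in csv.DictReader(lines):
        groups.setdefault(row["Id"], []).append(row)
    # Emit one {"data": [...]} dict per Id, in order of first appearance.
    return [{"data": rows} for rows in groups.values()]


unsorted_output = group_by_id(io.StringIO(SAMPLE_UNSORTED))
print(json.dumps(unsorted_output, indent=4))
```

This keeps every group in memory until the whole file has been read, which is fine for small files but is a trade-off compared with the streaming approaches above.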