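I am trying to convert a CSV file to JSON, merging rows that share the same id and name so that their emails are collected together.
CSV file: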
id,name,email
1,jim,test@gmail.com
1,jim,test2@gmail.com
2,kim,test3@gmail.com
Expected output:
{"row" : {"id":1,"name":"jim","email": ["test@gmail.com","test2@gmail.com"]}},
{"row" : {"id":2,"name":"kim","email": "test3@gmail.com"}}
Answer 0 (score: 0)
You can do this with pandas:
import pandas as pd
df = pd.read_csv('test.csv', index_col=None)
print(df)
# Output:
   id name            email
0   1  jim   test@gmail.com
1   1  jim  test2@gmail.com
2   2  kim  test3@gmail.com
df1 = df.groupby(['id', 'name'])['email'].apply(list).reset_index()
df_json = df1.to_json(orient='index')
print(df_json)
#Output:
{"0":{"id":1,"name":"jim","email":["test@gmail.com","test2@gmail.com"]},"1":{"id":2,"name":"kim","email":["test3@gmail.com"]}}
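Note that the output above is keyed by row index and always stores the emails as a list, which differs from the shape asked for in the question. The following is a minimal sketch of one way to get closer to the requested format with pandas. The function name csv_to_rows and the inline CSV_TEXT sample are illustrative assumptions, not part of the original answer.

```python
import io
import json

import pandas as pd

# Inline copy of the CSV from the question, so the sketch runs without a file.
CSV_TEXT = """id,name,email
1,jim,test@gmail.com
1,jim,test2@gmail.com
2,kim,test3@gmail.com
"""


def csv_to_rows(source):
    """Group emails by (id, name) and wrap each record in a 'row' key.

    `source` is a path or file-like object accepted by pd.read_csv.
    (Hypothetical helper, written for this example only.)
    """
    df = pd.read_csv(source)
    grouped = (
        df.groupby(['id', 'name'], sort=False)['email']
        .apply(list)
        .reset_index()
    )
    rows = []
    for record in grouped.to_dict(orient='records'):
        emails = record['email']
        # A single address is emitted as a plain string, as in the question.
        record['email'] = emails[0] if len(emails) == 1 else emails
        # Cast to a built-in int so json.dumps accepts it on any pandas version.
        record['id'] = int(record['id'])
        rows.append({'row': record})
    return rows


rows = csv_to_rows(io.StringIO(CSV_TEXT))
for row in rows:
    print(json.dumps(row))
```

Each printed line is a separate JSON object. If a single JSON document is needed, json.dumps(rows) serializes the whole list at once.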
Answer 1 (score: 0)
Here is a somewhat clunky implementation:
import csv
import json

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    # Get headers
    headers = next(reader, None)
    result = {}
    for row in reader:
        # Combine header and line to get a dict
        data = dict(zip(headers, row))
        if data['id'] not in result:
            # First time this id is seen: start a list of emails
            data.update({'email': [data.pop('email')]})
            result[data['id']] = data
        else:
            # Fail loudly if the id and name fields are not consistent
            assert data['name'] == result[data['id']]['name']
            result[data['id']]['email'].append(data['email'])

for rec in result.values():
    try:
        # Try to unpack as a single value; if that fails, leave the list as is
        rec['email'], = rec['email']
    except ValueError:
        pass
    # Note: csv.reader yields strings, so "id" is printed as "1", not 1
    print(json.dumps({'row': rec}))
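The same idea can be wrapped in a small function so it is easier to reuse and test. This is only a sketch: the name group_emails is hypothetical, it uses csv.DictReader and dict.setdefault instead of the manual header handling above, and it assumes every id is an integer so that it can be emitted as a JSON number like in the question.

```python
import csv
import io
import json


def group_emails(lines):
    """Return one {'row': ...} dict per id, collecting emails in file order.

    `lines` is any iterable of CSV lines, such as an open file.
    (Hypothetical helper; assumes the columns id, name, and email exist.)
    """
    grouped = {}
    for record in csv.DictReader(lines):
        key = record['id']
        # Assumption: ids are integers, so int() will not raise here.
        entry = grouped.setdefault(
            key, {'id': int(key), 'name': record['name'], 'email': []})
        entry['email'].append(record['email'])

    result = []
    for entry in grouped.values():
        # Collapse a one-element list to a plain string, as in the question.
        if len(entry['email']) == 1:
            entry['email'] = entry['email'][0]
        result.append({'row': entry})
    return result


sample = io.StringIO(
    "id,name,email\n"
    "1,jim,test@gmail.com\n"
    "1,jim,test2@gmail.com\n"
    "2,kim,test3@gmail.com\n"
)
grouped_rows = group_emails(sample)
for item in grouped_rows:
    print(json.dumps(item))
```

To read from disk instead of the inline sample, pass an open file, for example group_emails(open('data.csv', newline='')).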