I have a DataFrame df
df:
col1 col2 col3
1 2 3
4 5 6
7 8 9
The JSON I am looking for is:
{
"col1": 1,
"col1": 4,
"col1": 7,
},
{
"col2": 2,
"col2": 5,
"col2": 8
},
{
"col3": 3,
"col3": 6,
"col3": 9,
}
I tried df.to_json, but it does not work:
df.to_json(orient='records')
It gives this output:
'[{"col1":1,"col2":2,"col3":3},{"col1":4,"col2":5,"col3":6},
{"col1":7,"col2":8,"col3":9}]'
This is not the output I want.
How can I do this in the most efficient way using pandas / Python?
Answer 0 (score: 1)
You need to do
df.to_json('file.json', orient='records')
Note that this gives you an array of objects, one per row (shown here indented for readability):
[
{
"col1": 1,
"col2": 2,
"col3": 3
},
{
"col1": 4,
"col2": 5,
"col3": 6
},
{
"col1": 7,
"col2": 8,
"col3": 9
}
]
You can also do
df.to_json('file.json', orient='records', lines=True)
if you want the output to look like this, with one JSON object per line and no enclosing array:
{"col1":1,"col2":2,"col3":3}
{"col1":4,"col2":5,"col3":6}
{"col1":7,"col2":8,"col3":9}
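To pretty-print the output, install the jq command-line tool with your system package manager (the jq package on PyPI provides Python bindings rather than the jq command) and run it on the file:
sudo apt-get install jq   # or: brew install jq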
cat file.json | jq '.' > new_file.json
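If jq is not available, the same re-indentation can be sketched in Python with the standard json module. This is only a sketch under assumptions: the DataFrame is rebuilt here from the values in the question, and the variable names are illustrative.

```python
import json

import pandas as pd

# Example frame rebuilt from the values in the question
df = pd.DataFrame({"col1": [1, 4, 7], "col2": [2, 5, 8], "col3": [3, 6, 9]})

# Called without a path, to_json() returns the JSON text as a string
records_json = df.to_json(orient="records")

# Parse the compact text and dump it again with indentation
pretty = json.dumps(json.loads(records_json), indent=4)

print(pretty)
```

Writing `pretty` to a file with a normal `open(..., "w")` call would give roughly what the jq step produces.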
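Answer 1 (score: 1)
JSON objects are treated as dictionaries in Python. The JSON you specified has duplicate keys, so it can only be built as a string (and not with the Python json library). The following code: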
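import json
from io import StringIO
import numpy as np
import pandas as pd

# Build the 3x3 example frame from the question
df = pd.DataFrame(np.arange(1, 10).reshape((3, 3)), columns=['col1', 'col2', 'col3'])
# Serialize column-wise into an in-memory buffer, then parse it back
buf = StringIO()
df.to_json(buf, orient='columns')
parsed = json.loads(buf.getvalue())
# Write the parsed data out again with indentation
with open("pretty.json", "w") as out:
    json.dump(parsed, out, indent=4)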
will produce the following JSON:
{
"col1": {
"0": 1,
"1": 4,
"2": 7
},
"col2": {
"0": 2,
"1": 5,
"2": 8
},
"col3": {
"0": 3,
"1": 6,
"2": 9
}
}
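which you can later load back into Python. Alternatively, this script generates exactly the string you want:
# Build the text by hand: one {...} block per column, one repeated key per value
with open("exact.json", "w+") as out:
    out.write('[\n\t{\n' + '\t},\n\t{\n'.join(["".join(["\t\t\"%s\": %s,\n" % (c, df[c][i]) for i in df.index]) for c in df.columns]) + '\t}\n]')
and the output will be: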
[
{
"col1": 1,
"col1": 4,
"col1": 7,
},
{
"col2": 2,
"col2": 5,
"col2": 8,
},
{
"col3": 3,
"col3": 6,
"col3": 9,
}
]
Edit: fixed the brackets
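For comparison, here is a more readable sketch of the same string-building idea, followed by a duplicate-free alternative that keeps one key per column. The helper name `column_block` is hypothetical, and unlike the one-liner above this version leaves out the trailing commas so that the text stays syntactically parseable.

```python
import json

import pandas as pd

# Example frame rebuilt from the values in the question
df = pd.DataFrame({"col1": [1, 4, 7], "col2": [2, 5, 8], "col3": [3, 6, 9]})


def column_block(name, values):
    """Hypothetical helper: render one column as an object that repeats its key per value."""
    pairs = ",\n".join(f'\t\t{json.dumps(name)}: {json.dumps(v)}' for v in values)
    return "\t{\n" + pairs + "\n\t}"


# tolist() converts NumPy integers to plain Python ints before formatting
duplicate_key_text = "[\n" + ",\n".join(
    column_block(col, df[col].tolist()) for col in df.columns
) + "\n]"

# Alternative without duplicate keys: each column maps to a list of its values
by_column = json.dumps({col: df[col].tolist() for col in df.columns}, indent=4)

print(duplicate_key_text)
print(by_column)
```

The second form can be read back with `json.loads` without losing any values, which may be preferable if the consumer of the file does not strictly require repeated keys.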
Answer 2 (score: 0)
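JSON with duplicate keys like this is syntactically allowed but not recommended, and strongly so, because during deserialization most parsers keep only the last value for each repeated key and you lose all the others.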
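A small sketch of that loss, using only Python's standard json module (the sample text is one of the objects from the question, without the trailing comma):

```python
import json

text = '{"col1": 1, "col1": 4, "col1": 7}'

# Default behaviour: the object becomes a dict, so later values overwrite earlier ones
as_dict = json.loads(text)

# object_pairs_hook receives every key/value pair in order, so nothing is dropped
as_pairs = json.loads(text, object_pairs_hook=list)

print(as_dict)   # {'col1': 7}
print(as_pairs)  # [('col1', 1), ('col1', 4), ('col1', 7)]
```

Other parsers may behave differently, so any consumer of such a file would need to handle the repeated keys explicitly.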