I am using pd.read_sql_query() to fetch data from a database and then calling to_json(orient='records') on the result.
Here are the DataFrames:
(1)
price_formula_id premium product_id exchange product_name product_code weight
0 30064 0.0 c001 CME 2018 CL 0.3
1 30064 0.0 c002 CME 2018 CL 0.7
(2)
price_formula_id premium product_id exchange product_name product_code weight
0 30064 NONE c001 CME 2018 CL 0.3
1 30064 NONE c002 CME 2018 CL 0.7
The result is converted to this form:
[{
"price_formula_id": "30064",
"premium": "0.0",
"product_id": "c001",
"exchange": "CME",
"product_name": "2018",
"product_code": "CL",
"weight": "0.3"
},
{
"price_formula_id": "30064",
"premium": "0.0",
"product_id": "c002",
"exchange": "CME",
"product_name": "2018",
"product_code": "CL",
"weight": "0.7"
}]
But what I actually want is this:
{
"price_formula_id": "30064",
"premium": "0.0",
"basket":
[
{"product_id": "c001",
"exchange": "CME",
"product_name": "2018",
"product_code": "CL",
"weight": "0.3"
},
{
"product_id": "c002",
"exchange": "CME",
"product_name": "2018",
"product_code": "CL",
"weight": "0.7"
}
]
}
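I need to group the rows that share the same information and put the remaining columns under a new key, "basket". How can I do this? Thank you very much.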
Answer 0 (score: 1)
Use groupby with a custom function that calls to_dict on all columns selected by difference, then reset_index, and finally convert the result with to_json:
# Columns to nest: everything except the grouping keys
cols = df.columns.difference(['price_formula_id', 'premium'])
j = (df.groupby(['price_formula_id', 'premium'])[cols]
       .apply(lambda x: x.to_dict('records'))  # list of row dicts per group
       .reset_index(name='basket')
       .to_json(orient='records'))
print(j)
[{
"price_formula_id": 30064,
"premium": 0.0,
"basket": [{
"exchange": "CME",
"product_code": "CL",
"product_id": "c001",
"product_name": 2018,
"weight": 0.3
},
{
"exchange": "CME",
"product_code": "CL",
"product_id": "c002",
"product_name": 2018,
"weight": 0.7
}
]
}]
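
One caveat for table (2): groupby drops rows whose grouping key is missing by default, so a premium column that is entirely empty would produce no groups at all. As a rough alternative sketch (not the approach used in the answer above), the flat records can be grouped in plain Python, which keeps missing keys as null. The helper name nest_records and the sample data are hypothetical and only mirror the tables in the question.

```python
import json
import pandas as pd

def nest_records(df, key_cols, list_name="basket"):
    """Group flat rows by key_cols and nest the other columns under list_name."""
    # Round-trip through to_json so values become plain JSON types (missing -> None).
    rows = json.loads(df.to_json(orient="records"))
    groups = {}  # insertion-ordered: one entry per distinct key tuple
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in groups:
            groups[key] = {c: row[c] for c in key_cols}
            groups[key][list_name] = []
        groups[key][list_name].append(
            {c: v for c, v in row.items() if c not in key_cols}
        )
    return list(groups.values())

# Hypothetical data mirroring table (2), where premium is missing.
df = pd.DataFrame({
    "price_formula_id": [30064, 30064],
    "premium": [None, None],
    "product_id": ["c001", "c002"],
    "exchange": ["CME", "CME"],
    "product_name": [2018, 2018],
    "product_code": ["CL", "CL"],
    "weight": [0.3, 0.7],
})

nested = nest_records(df, ["price_formula_id", "premium"])
print(json.dumps(nested, indent=2))
```

The function returns a list with one dict per group. If only a single group is expected, as in the question, nested[0] gives the single object shown there.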