I have a data_frame that I want to extend with extra values, using something like the fill_na method, and then convert to JSON.
This is my data_frame:
0 home Term t1 t2 t3 t4
1 Yes 1 0.85 0.85 0.88 0.85
2 Yes 2 0.88 0.88 0.904 0.88
3 Yes 3+ 0.91 0.91 0.928 0.91
4 No 1 1 1 1 1
5 No 2 1 1 1 1
6 No 3+ 1 1 1 1
I need to transform the data_frame like this:
0 home Term t1 t2 t3 t4
1 Yes 1 0.85 0.85 0.88 0.85
2 Yes 2 0.88 0.88 0.904 0.88
3 Yes 3 0.91 0.91 0.928 0.91
3 Yes 4 0.91 0.91 0.928 0.91
3 Yes 5 0.91 0.91 0.928 0.91
4 No 1 1 1 1 1
5 No 2 1 1 1 1
6 No 3 1 1 1 1
7 No 3 1 1 1 1
8 No 4 1 1 1 1
9 No 5 1 1 1 1
And I need to convert it to JSON like this:
{
"Yes": {
"1": {
"t1": 0.85,
"t2": 0.85,
"t3": 0.88,
"t4": 0.85
},
"2": {
"t1": 0.88,
"t2": 0.88,
"t3": 0.904,
"t4": 0.88
},
"3": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.928,
"t4": 0.91
},
"4": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.928,
"t4": 0.91
},
"5": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.928,
"t4": 0.91
}
},
"No": {
"1": {
"t1": 1,
"t2": 1,
"t3": 1,
"t4": 1
},
"2": {
"t1": 1,
"t2": 1,
"t3": 1,
"t4": 1
},
"3": {
"t1": 1,
"t2": 1,
"t3": 1,
"t4": 1
},
"4": {
"t1": 1,
"t2": 1,
"t3": 1,
"t4": 1
},
"5": {
"t1": 1,
"t2": 1,
"t3": 1,
"t4": 1
}}}
Answer 0 (score: 2)
IIUC, replace N+ with [N, N+1, ..., any_max_value], then explode:
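any_max_value = 6  # upper bound for open-ended terms such as '3+'
# Replace 'N+' with the list [N, ..., any_max_value]; leave other values unchanged
df['Term'] = [list(range(int(i[:-1]), any_max_value + 1)) if str(i).endswith('+') else i for i in df['Term']]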
After the replacement:
0 home Term t1 t2 t3 t4
0 1 Yes 1 0.85 0.85 0.880 0.85
1 2 Yes 2 0.88 0.88 0.904 0.88
2 3 Yes [3, 4, 5, 6] 0.91 0.91 0.928 0.91
3 4 No 1 1.00 1.00 1.000 1.00
4 5 No 2 1.00 1.00 1.000 1.00
5 6 No [3, 4, 5, 6] 1.00 1.00 1.000 1.00
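The replacement step can also be written as a small helper, which is easier to test on its own. This is only a sketch: the name expand_term and its max_term parameter are made up for illustration, and unlike the one-liner it always returns a list of ints, so every row is handled the same way later.

```python
def expand_term(term, max_term):
    """Return [N, ..., max_term] for an open-ended 'N+' label, else [int(term)]."""
    text = str(term).strip()
    if text.endswith("+"):
        start = int(text[:-1])          # drop the trailing '+'
        return list(range(start, max_term + 1))
    return [int(text)]                   # plain label: wrap in a one-element list

print(expand_term("3+", 5))  # [3, 4, 5]
print(expand_term("2", 5))   # [2]
```

Note that if N is larger than max_term the list comes back empty, and pandas explode turns an empty list into NaN for that row, so it may be worth validating the bound first.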
Then use pandas.DataFrame.explode:
new_df = df.explode('Term').reset_index(drop=True)
print(new_df)
Output:
0 home Term t1 t2 t3 t4
0 1 Yes 1 0.85 0.85 0.880 0.85
1 2 Yes 2 0.88 0.88 0.904 0.88
2 3 Yes 3 0.91 0.91 0.928 0.91
3 3 Yes 4 0.91 0.91 0.928 0.91
4 3 Yes 5 0.91 0.91 0.928 0.91
5 3 Yes 6 0.91 0.91 0.928 0.91
6 4 No 1 1.00 1.00 1.000 1.00
7 5 No 2 1.00 1.00 1.000 1.00
8 6 No 3 1.00 1.00 1.000 1.00
9 6 No 4 1.00 1.00 1.000 1.00
10 6 No 5 1.00 1.00 1.000 1.00
11 6 No 6 1.00 1.00 1.000 1.00
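To see why reset_index(drop=True) is needed, here is a minimal sketch on a tiny made-up frame (only one value column, t1, to keep it short). explode repeats the original index label for every element of a list, and the Term column ends up holding a mix of str and int, so converting it to one type afterwards is an optional extra step that is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "home": ["Yes", "Yes"],
    "Term": ["2", [3, 4, 5]],   # one scalar label, one list to expand
    "t1": [0.88, 0.91],
})

exploded = df.explode("Term")
# The row that held the list is repeated, and so is its index label.
print(exploded.index.tolist())   # [0, 1, 1, 1]

new_df = exploded.reset_index(drop=True)
# Term is now an object column mixing '2' (str) with 3, 4, 5 (int);
# casting gives a single consistent type.
new_df["Term"] = new_df["Term"].astype(int)
print(new_df)
```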
Finally, build the JSON:
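# drop(columns=...) replaces the positional axis argument drop('home', 1), which pandas 2.0 no longer accepts
j = {k: d.drop(columns='home').set_index('Term').to_dict(orient='index') for k, d in new_df.groupby('home')}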
import json
print(json.dumps(j, indent=4))
Output:
{
"No": {
"2": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
},
"3": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
},
"4": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
},
"5": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
},
"6": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
},
"1": {
"t1": 1.0,
"t2": 1.0,
"t3": 1.0,
"t4": 1.0
}
},
"Yes": {
"2": {
"t1": 0.88,
"t2": 0.88,
"t3": 0.904,
"t4": 0.88
},
"3": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.9279999999999999,
"t4": 0.91
},
"4": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.9279999999999999,
"t4": 0.91
},
"5": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.9279999999999999,
"t4": 0.91
},
"6": {
"t1": 0.91,
"t2": 0.91,
"t3": 0.9279999999999999,
"t4": 0.91
},
"1": {
"t1": 0.85,
"t2": 0.85,
"t3": 0.88,
"t4": 0.85
}
}
}
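Putting the steps together, the following is a self-contained sketch that aims for the layout asked for in the question. It makes a few assumptions that are not in the answer above: the upper bound is 5 (the constant MAX_TERM is a made-up name), the frame is rebuilt by hand from the question's table, Term is cast to str so that all keys have one type, and groupby uses sort=False so that "Yes" comes before "No".

```python
import json
import pandas as pd

MAX_TERM = 5  # assumed upper bound for the open-ended "3+" bucket

df = pd.DataFrame({
    "home": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "Term": ["1", "2", "3+", "1", "2", "3+"],
    "t1": [0.85, 0.88, 0.91, 1, 1, 1],
    "t2": [0.85, 0.88, 0.91, 1, 1, 1],
    "t3": [0.88, 0.904, 0.928, 1, 1, 1],
    "t4": [0.85, 0.88, 0.91, 1, 1, 1],
})

# Turn every label into a list of ints so explode treats all rows uniformly.
df["Term"] = [
    list(range(int(t[:-1]), MAX_TERM + 1)) if t.endswith("+") else [int(t)]
    for t in df["Term"]
]
new_df = df.explode("Term").reset_index(drop=True)
new_df["Term"] = new_df["Term"].astype(str)   # JSON object keys are strings

# sort=False keeps the groups in first-appearance order ("Yes" before "No").
result = {
    home: group.drop(columns="home").set_index("Term").to_dict(orient="index")
    for home, group in new_df.groupby("home", sort=False)
}
print(json.dumps(result, indent=4))
```

Because the value columns are floats, the 1 entries are printed as 1.0, as in the output above.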