How to use the fillna method to fill a data frame with consecutive integers

Asked: 2019-09-04 06:04:32

Tags: python pandas

I have a data_frame. I want to use the fillna method to add some values to it and then convert it to JSON.

This is my data_frame:

0                 home       Term    t1        t2     t3          t4   
1                 Yes          1   0.85      0.85   0.88         0.85   
2                 Yes          2   0.88      0.88  0.904         0.88   
3                 Yes          3+   0.91      0.91  0.928         0.91    
4                  No          1     1         1      1            1   
5                  No          2     1         1      1            1   
6                  No          3+     1         1      1            1  

I need to transform the data_frame like this, expanding each "3+" term into the individual terms 3, 4 and 5:

0                 home       Term    t1        t2     t3          t4   
1                 Yes          1    0.85      0.85   0.88         0.85   
2                 Yes          2    0.88      0.88  0.904         0.88   
3                 Yes          3    0.91      0.91  0.928         0.91 
3                 Yes          4    0.91      0.91  0.928         0.91 
3                 Yes          5    0.91      0.91  0.928         0.91    
4                  No          1      1         1      1            1   
5                  No          2      1         1      1            1   
6                  No          3      1         1      1            1   
7                  No          3      1         1      1            1   
8                  No          4      1         1      1            1   
9                  No          5      1         1      1            1  

I then need to convert it to JSON like this:

{
"Yes": {
"1": {
    "t1": 0.85,
    "t2": 0.85,
    "t3": 0.88,
    "t4": 0.85

},
"2": {
    "t1": 0.88,
    "t2": 0.88,
    "t3": 0.904,
    "t4": 0.88

},
"3": {
    "t1": 0.91,
    "t2": 0.91,
    "t3": 0.928,
    "t4": 0.91

},
"4": {
     "t1": 0.91,
    "t2": 0.91,
    "t3": 0.928,
    "t4": 0.91
},
"5": {
    "t1": 0.91,
    "t2": 0.91,
    "t3": 0.928,
    "t4": 0.91

}},
"No": {
"1": {
    "t1": 1,
    "t2": 1,
    "t3": 1,
    "t4": 1

},
"2": {
    "P1": 1,
    "P5": 1,
    "P7": 1,
    "P10": 1,
    "P20": 0
},
"3": {
    "t1": 1,
    "t2": 1,
    "t3": 1,
    "t4": 1
},
"4": {
    "t1": 1,
    "t2": 1,
    "t3": 1,
    "t4": 1
},
"5": {
   "t1": 1,
    "t2": 1,
    "t3": 1,
    "t4": 1

}}}

1 answer:

Answer 0 (score: 2)

IIUC, replace each N+ with the list [N, N+1, ..., any_max_value], then explode the column.

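any_max_value = 6  # arbitrary upper bound; use 5 to match the desired output in the question
# Turn 'N+' into the list [N, ..., any_max_value]; leave every other term unchanged
df['Term'] = [list(range(int(i[:-1]), any_max_value + 1)) if str(i).endswith('+') else i for i in df['Term']]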

After the replacement:

   0 home          Term    t1    t2     t3    t4
0  1  Yes             1  0.85  0.85  0.880  0.85
1  2  Yes             2  0.88  0.88  0.904  0.88
2  3  Yes  [3, 4, 5, 6]  0.91  0.91  0.928  0.91
3  4   No             1  1.00  1.00  1.000  1.00
4  5   No             2  1.00  1.00  1.000  1.00
5  6   No  [3, 4, 5, 6]  1.00  1.00  1.000  1.00
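The replacement step can also be written as a small helper, which may be easier to reuse and test. This is a minimal sketch, assuming the Term column holds strings such as '1', '2' and '3+'. The sample frame, the helper name `expand_term` and the bound `max_value=5` are illustrative choices and are not part of the answer above.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data (the asker did not post code for it)
df = pd.DataFrame({
    "home": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "Term": ["1", "2", "3+", "1", "2", "3+"],
    "t1": [0.85, 0.88, 0.91, 1, 1, 1],
    "t2": [0.85, 0.88, 0.91, 1, 1, 1],
    "t3": [0.88, 0.904, 0.928, 1, 1, 1],
    "t4": [0.85, 0.88, 0.91, 1, 1, 1],
})


def expand_term(term, max_value):
    """Return [N, ..., max_value] for 'N+', otherwise the term as an int."""
    text = str(term).strip()
    if text.endswith("+"):
        return list(range(int(text[:-1]), max_value + 1))
    return int(text)


# Open-ended terms become lists; plain terms become ints
df["Term"] = df["Term"].map(lambda t: expand_term(t, max_value=5))
print(df["Term"].tolist())
```

Converting the plain terms to int at this point keeps the column uniformly typed once the lists are exploded.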

Then apply pandas.DataFrame.explode (available since pandas 0.25):

new_df = df.explode('Term').reset_index(drop=True)
print(new_df)

Output:

    0 home Term    t1    t2     t3    t4
0   1  Yes    1  0.85  0.85  0.880  0.85
1   2  Yes    2  0.88  0.88  0.904  0.88
2   3  Yes    3  0.91  0.91  0.928  0.91
3   3  Yes    4  0.91  0.91  0.928  0.91
4   3  Yes    5  0.91  0.91  0.928  0.91
5   3  Yes    6  0.91  0.91  0.928  0.91
6   4   No    1  1.00  1.00  1.000  1.00
7   5   No    2  1.00  1.00  1.000  1.00
8   6   No    3  1.00  1.00  1.000  1.00
9   6   No    4  1.00  1.00  1.000  1.00
10  6   No    5  1.00  1.00  1.000  1.00
11  6   No    6  1.00  1.00  1.000  1.00
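To see what explode does in isolation, here is a small sketch on a reduced, made-up frame (two rows and one value column, not the full data from the question). A list-valued cell yields one row per element with the other columns repeated, while a scalar cell is left as a single row. The final `astype(int)` is an optional extra, assuming every term is numeric after the expansion.

```python
import pandas as pd

# Reduced, illustrative frame: one list-valued Term and one scalar Term
df = pd.DataFrame({
    "home": ["Yes", "No"],
    "Term": [[3, 4, 5], 1],
    "t1": [0.91, 1.0],
})

# One row per list element; reset_index removes the repeated index labels
exploded = df.explode("Term").reset_index(drop=True)

# explode leaves the column with object dtype; cast it to a plain integer column
exploded["Term"] = exploded["Term"].astype(int)
print(exploded)
```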

Finally, build the JSON:

# Group by 'home', then map each Term to its row of t-values
j = {k: d.drop(columns='home').set_index('Term').to_dict(orient='index') for k, d in new_df.groupby('home')}

import json
print(json.dumps(j, indent=4))

Output:

{
    "No": {
        "2": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        },
        "3": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        },
        "4": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        },
        "5": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        },
        "6": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        },
        "1": {
            "t1": 1.0,
            "t2": 1.0,
            "t3": 1.0,
            "t4": 1.0
        }
    },
    "Yes": {
        "2": {
            "t1": 0.88,
            "t2": 0.88,
            "t3": 0.904,
            "t4": 0.88
        },
        "3": {
            "t1": 0.91,
            "t2": 0.91,
            "t3": 0.9279999999999999,
            "t4": 0.91
        },
        "4": {
            "t1": 0.91,
            "t2": 0.91,
            "t3": 0.9279999999999999,
            "t4": 0.91
        },
        "5": {
            "t1": 0.91,
            "t2": 0.91,
            "t3": 0.9279999999999999,
            "t4": 0.91
        },
        "6": {
            "t1": 0.91,
            "t2": 0.91,
            "t3": 0.9279999999999999,
            "t4": 0.91
        },
        "1": {
            "t1": 0.85,
            "t2": 0.85,
            "t3": 0.88,
            "t4": 0.85
        }
    }
}
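If the group order and float noise such as 0.9279999999999999 matter, the final step could be wrapped in a function along these lines. This is a sketch under assumptions rather than part of the answer: the function name `to_nested`, its parameters, the rounding to 4 decimals and the reduced sample frame are all hypothetical. It keeps groups in order of first appearance via `sort=False` and turns every Term key into a string explicitly.

```python
import json

import pandas as pd

# Reduced, illustrative stand-in for the exploded frame
new_df = pd.DataFrame({
    "home": ["Yes", "Yes", "No"],
    "Term": [1, 2, 1],
    "t1": [0.85, 0.88, 1.0],
    "t2": [0.85, 0.88, 1.0],
})


def to_nested(frame, group_col="home", key_col="Term", decimals=4):
    """Build {group: {term: {column: value}}} from a flat frame."""
    nested = {}
    # sort=False keeps the groups in the order they first appear
    for group, part in frame.groupby(group_col, sort=False):
        rows = part.drop(columns=group_col).set_index(key_col).round(decimals)
        # JSON object keys are strings, so convert the Term keys up front
        nested[group] = {str(k): v for k, v in rows.to_dict(orient="index").items()}
    return nested


result = to_nested(new_df)
print(json.dumps(result, indent=4))
```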