Question

I have a dataframe of the following form:

     id   |     x1     |   x2
---------------------------------
0     1   |    Apples  |    5 
1     1   |    Oranges |    3 
2     1   |    Apples  |    6 
3     2   |    Bananas |   4.5 
4     2   |    Oranges |    7 
5     3   |    Oranges |   5.5 
6     3   |    Apples  |    5

I want to iterate over the dataframe and, for each group of rows sharing the same 'id', write a new JSON file named record_<id>.json:

For example, record_1.json:

{ "record" : [
       { "x1": "Apples" , 
         "x2":    5  
        },
       { "x1": "Oranges" , 
         "x2":    3  
        },
       { "x1": "Apples" , 
         "x2":    6  
        } 
   ]
}

record_2.json

{ "record" : [
       { "x1": "Bananas" , 
         "x2":   4.5  
        },
       { "x1": "Oranges" , 
         "x2":    7  
        } 
   ]
}

and so on.

Is there a simple way to do this?

Answer 1

IIUC, just iterate over the groupby object:

for _, group in df.groupby("id"):
    # Drop the grouping column, then turn each remaining row into a dict.
    print({"record": group.drop(columns="id").to_dict("records")})

{'record': [{'x1': 'Apples', 'x2': 5.0}, {'x1': 'Oranges', 'x2': 3.0}, {'x1': 'Apples', 'x2': 6.0}]}
{'record': [{'x1': 'Bananas', 'x2': 4.5}, {'x1': 'Oranges', 'x2': 7.0}]}
{'record': [{'x1': 'Oranges', 'x2': 5.5}, {'x1': 'Apples', 'x2': 5.0}]}
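
The loop above only prints each group. To actually write the files the question asks for, something along these lines should work. This is a minimal sketch: the helper name write_records and the out_dir parameter are made up for illustration, and the dataframe is rebuilt from the sample data in the question.

```python
import json
from pathlib import Path

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3],
    "x1": ["Apples", "Oranges", "Apples", "Bananas", "Oranges", "Oranges", "Apples"],
    "x2": [5, 3, 6, 4.5, 7, 5.5, 5],
})


def write_records(df, out_dir="."):
    """Write one record_<id>.json per distinct id and return the paths written.

    write_records and out_dir are hypothetical names, not part of pandas.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for group_id, group in df.groupby("id"):
        # Same payload as the printed dicts: every column except "id".
        payload = {"record": group.drop(columns="id").to_dict("records")}
        path = out_dir / f"record_{group_id}.json"
        with path.open("w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        paths.append(path)
    return paths
```

Calling write_records(df, "out") would then create out/record_1.json, out/record_2.json and out/record_3.json. Going through json.dump rather than DataFrame.to_json makes it easy to wrap the rows under the "record" key.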
