Question

I have a pandas DataFrame like the one below:

idx, f1, f2, f3
1,   a,  a,  b
2,   b,  a,  c
3,   a,  b,  c
.
.
.
87   e,  e,  e

I need to convert the other columns into a list of dictionaries, grouped by the idx column. So the final result should be:

idx, features
1 ,  [{f1:a, f2:a, f3:b}, {f1:b, f2:a, f3:c}, {f1:a, f2:b, f3:c}]
.
.
.
87,  [{f1: e, f2:e, f3:e}]

Is it possible to do something like this in pandas using groupby?

Answer 1

You can groupby on the index and then apply to_json with orient='records', which returns one JSON string per group:

print(df)
    f1 f2 f3
idx         
1    a  a  b
1    b  a  c
1    a  b  c
87   e  e  e

print(df.groupby(level=0).apply(lambda x: x.to_json(orient='records')))

1     [{"f1":"a","f2":"a","f3":"b"},{"f1":"b","f2":"...
87                       [{"f1":"e","f2":"e","f3":"e"}]
dtype: object
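
For reference, the index-based variant can be reproduced end to end roughly as follows. This is a minimal sketch: the DataFrame is rebuilt from the four rows shown in the printout, and json.loads is used only to illustrate that each cell holds a JSON string rather than a Python list (the names features and parsed are just for illustration).

```python
import json

import pandas as pd

# Rebuild the example frame with idx as a (non-unique) index.
df = pd.DataFrame(
    {
        "f1": ["a", "b", "a", "e"],
        "f2": ["a", "a", "b", "e"],
        "f3": ["b", "c", "c", "e"],
    },
    index=pd.Index([1, 1, 1, 87], name="idx"),
)

# One JSON string per idx value; each string encodes a list of records.
features = df.groupby(level=0).apply(lambda x: x.to_json(orient="records"))

# The cells are strings, so parse them if Python lists of dicts are needed.
parsed = features.map(json.loads)
print(parsed.loc[87])
```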

Or, if the idx column is not the index:

print(df)
   idx f1 f2 f3
0    1  a  a  b
1    1  b  a  c
2    1  a  b  c
3   87  e  e  e

print(df.groupby('idx').apply(lambda x: x.to_json(orient='records')))
idx
1     [{"idx":1,"f1":"a","f2":"a","f3":"b"},{"idx":1...
87              [{"idx":87,"f1":"e","f2":"e","f3":"e"}]
dtype: object
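
Note that grouping on the column keeps idx inside every record. If that is not wanted, one option (a sketch, not the only way) is to select the feature columns before calling apply, and to use to_dict(orient='records') when real Python lists of dicts are needed instead of JSON strings. The list feature_cols and the output column name features are assumptions taken from the desired output in the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "idx": [1, 1, 1, 87],
        "f1": ["a", "b", "a", "e"],
        "f2": ["a", "a", "b", "e"],
        "f3": ["b", "c", "c", "e"],
    }
)

# Assumed: every column except idx is a feature column.
feature_cols = ["f1", "f2", "f3"]

# Select the feature columns first so idx does not end up in each record,
# then build one list of dicts per group.
result = (
    df.groupby("idx")[feature_cols]
    .apply(lambda g: g.to_dict(orient="records"))
    .reset_index(name="features")
)
print(result)
```

Swapping to_dict for to_json in the lambda gives the JSON-string form used in the answer, again without idx in the records.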
