Question

My data frame looks like this:

import pandas as pd
df = pd.read_csv('temp.csv', index_col=None)

print(df)
>>>>

   report   action    label
0       1  disable  label_a
1       1  disable  label_b
2       1  disable  label_c
3       2    alert  label_b
4       2    alert  label_c
5       3   ignore  label_a
6       3   ignore  label_c

What I want to do is convert it to this:

   report   action  label_a  label_b  label_c
0       1  disable        1        1        1
1       2    alert        0        1        1
2       3   ignore        1        0        1

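Essentially, I want to group by report (and by action, although action is always the same for every row of a given report), then expand the labels into columns holding 1 or 0 to indicate whether each label appears as a row in the original data.

This SO question gets me pretty close, but I can't figure out how to group by {{1}} without losing the label data in the grouped rows.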

Answer 1

Use pivot_table():

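# Count the rows for each (report, action) group and label;
# combinations that never occur are filled with 0.
df.pivot_table(index=["report", "action"],
               columns="label",
               aggfunc="size",
               fill_value=0)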

Output:

label           label_a  label_b  label_c
report action                            
1      disable        1        1        1
2      alert          0        1        1
3      ignore         1        0        1
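
The result above keeps report and action in the index. If the flat layout from the question is wanted, one possible approach is sketched below. It swaps in pd.crosstab instead of pivot_table, and it builds the sample frame inline rather than reading temp.csv, so the data literal and the names counts and flat are assumptions made for illustration only.

```python
import pandas as pd

# Sample data rebuilt inline (assumption: it mirrors the contents of temp.csv).
df = pd.DataFrame({
    "report": [1, 1, 1, 2, 2, 3, 3],
    "action": ["disable", "disable", "disable", "alert", "alert", "ignore", "ignore"],
    "label": ["label_a", "label_b", "label_c", "label_b", "label_c", "label_a", "label_c"],
})

# Count occurrences of each label per (report, action) pair; missing pairs become 0.
counts = pd.crosstab([df["report"], df["action"]], df["label"])

# Cap counts at 1 so repeated labels still yield a 0/1 indicator,
# then move report and action back into ordinary columns.
flat = counts.clip(upper=1).reset_index()
flat.columns.name = None  # drop the leftover "label" name on the column axis

print(flat)
```

The clip(upper=1) step only matters if a label can appear more than once for the same report. With the sample data every count is already 0 or 1.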
