pandas: grouping columns by time

Time: 2017-12-07 13:17:46

Tags: pandas dataframe

My df looks like this:

Index                                Receiver     Length         Retry
1970-01-01 00:00:00.000000000         R1          10             0
1970-01-01 00:00:00.800000000         R1          10             1
1970-01-01 00:00:01.000287000         R2          10             0
1970-01-01 00:00:01.600896000         R2          10             0
1970-01-01 00:00:02.001388000         R1          10             1
1970-01-01 00:00:02.004698000         R1          10             1
1970-01-01 00:00:02.006706000         R2          10             0
1970-01-01 00:00:02.501351000         R2          10             0
1970-01-01 00:00:02.810382000         R1          10             0
1970-01-01 00:00:03.001981000         R1          10             1
1970-01-01 00:00:03.377116000         R1          10             1
1970-01-01 00:00:03.701811000         R2          10             1
1970-01-01 00:00:03.910326000         R2          10             0
1970-01-01 00:00:03.951355000         R2          10             1

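I need to group df by time (1S) and then, within each group, sum the Length values separately for R1 and R2 over the rows where Retry == 1.

I am using the code below, but it seems to drop the time bins in which neither R1 nor R2 has Retry == 1 (i.e. where the condition is not met).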

df2 = df.query("Retry == 1").groupby([pd.Grouper(freq='1S'), 'Receiver']).Length.sum().unstack().fillna(0)

The desired output is:

Index                        R1    R2
1970-01-01 00:00:00          10    0
1970-01-01 00:00:01          0     0
1970-01-01 00:00:02          20    0
1970-01-01 00:00:03          20    20

A similar question can be found here.

2 answers:

Answer 0 (score: 1)

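You need reindex to add the missing datetimes. Filtering with query removes every row in a second that has no Retry == 1, so that second never reaches the groupby: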

df2 = df2.reindex(pd.date_range(df2.index[0], df2.index[-1], freq='1S'), fill_value=0)
print(df2)
Receiver               R1    R2
1970-01-01 00:00:00  10.0   0.0
1970-01-01 00:00:01   0.0   0.0
1970-01-01 00:00:02  20.0   0.0
1970-01-01 00:00:03  20.0  20.0
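
For reference, the steps above can be reproduced end to end with the sample data from the question. This is a minimal sketch rather than a definitive implementation: the way df is constructed here is an assumption about how the data was loaded, and the frequency is written as the lowercase "1s", which newer pandas releases prefer over "1S". As a variation, the target range is built from the original df instead of df2, so that a leading or trailing second with no Retry == 1 rows is also kept.

```python
import pandas as pd

# Sample data copied from the question.
stamps = pd.to_datetime([
    "1970-01-01 00:00:00.000000000", "1970-01-01 00:00:00.800000000",
    "1970-01-01 00:00:01.000287000", "1970-01-01 00:00:01.600896000",
    "1970-01-01 00:00:02.001388000", "1970-01-01 00:00:02.004698000",
    "1970-01-01 00:00:02.006706000", "1970-01-01 00:00:02.501351000",
    "1970-01-01 00:00:02.810382000", "1970-01-01 00:00:03.001981000",
    "1970-01-01 00:00:03.377116000", "1970-01-01 00:00:03.701811000",
    "1970-01-01 00:00:03.910326000", "1970-01-01 00:00:03.951355000",
])
df = pd.DataFrame(
    {
        "Receiver": ["R1", "R1", "R2", "R2", "R1", "R1", "R2",
                     "R2", "R1", "R1", "R1", "R2", "R2", "R2"],
        "Length": 10,
        "Retry": [0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
    },
    index=stamps,
)

# Keep only retried rows, then sum Length per 1-second bin and receiver.
df2 = (
    df.query("Retry == 1")
    .groupby([pd.Grouper(freq="1s"), "Receiver"])["Length"]
    .sum()
    .unstack()
    .fillna(0)
)

# Build the full range of seconds from the unfiltered data and
# fill any second that is missing from df2 with zeros.
full_range = pd.date_range(
    df.index.min().floor("1s"), df.index.max().floor("1s"), freq="1s"
)
df2 = df2.reindex(full_range, fill_value=0)
print(df2)
```

If the first and last seconds are known to contain retried rows, taking the range from df2.index[0] and df2.index[-1] as in the answer gives the same result.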

Answer 1 (score: 1)

I first use pivot_table() to pivot the data, then group it by time:

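# Zero out Length wherever Retry is 0, so only retried rows add to the sum
df['Value'] = df['Length']*df['Retry']
# One column per Receiver, then sum within each 1-second bin
df2 = pd.pivot_table(df, index=df.index, columns='Receiver', values='Value')
df2 = df2.groupby([pd.Grouper(freq='1S')]).sum()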
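
The same multiply-then-sum idea can be tried on the question's sample data. The sketch below is a variant, not the exact code above: it assumes each (timestamp, Receiver) pair is unique, uses set_index plus unstack in place of pivot_table, and uses resample("1s") in place of groupby with pd.Grouper. The construction of df is likewise an assumption about how the data was loaded.

```python
import pandas as pd

# Sample data copied from the question.
stamps = pd.to_datetime([
    "1970-01-01 00:00:00.000000000", "1970-01-01 00:00:00.800000000",
    "1970-01-01 00:00:01.000287000", "1970-01-01 00:00:01.600896000",
    "1970-01-01 00:00:02.001388000", "1970-01-01 00:00:02.004698000",
    "1970-01-01 00:00:02.006706000", "1970-01-01 00:00:02.501351000",
    "1970-01-01 00:00:02.810382000", "1970-01-01 00:00:03.001981000",
    "1970-01-01 00:00:03.377116000", "1970-01-01 00:00:03.701811000",
    "1970-01-01 00:00:03.910326000", "1970-01-01 00:00:03.951355000",
])
df = pd.DataFrame(
    {
        "Receiver": ["R1", "R1", "R2", "R2", "R1", "R1", "R2",
                     "R2", "R1", "R1", "R1", "R2", "R2", "R2"],
        "Length": 10,
        "Retry": [0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
    },
    index=stamps,
)

# Length counts only where Retry == 1; rows with Retry == 0 contribute 0.
df["Value"] = df["Length"] * df["Retry"]

# Move Receiver into the columns: one column per receiver, NaN where a
# receiver has no row at a given timestamp.
wide = df.set_index("Receiver", append=True)["Value"].unstack("Receiver")

# Sum within each 1-second bin; NaN is skipped, and a bin with no values sums to 0.
df2 = wide.resample("1s").sum()
print(df2)
```

Because the bins come from a single time grouping, seconds with no qualifying rows show up as 0 without a separate reindex step.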