Group by time and return a boolean if the group contains a value

Time: 2017-12-06 15:33:05

Tags: pandas dataframe

My df looks like this:

Index                                Receiver     
1970-01-01 00:00:00.000000000         R1          
1970-01-01 00:00:00.800000000         R1          
1970-01-01 00:00:01.000287000         R2          
1970-01-01 00:00:01.600896000         R2          
1970-01-01 00:00:02.001388000         R1          
1970-01-01 00:00:02.004698000         R1          
1970-01-01 00:00:02.006706000         R2          
1970-01-01 00:00:02.501351000         R2          
1970-01-01 00:00:02.810382000         R2          
1970-01-01 00:00:03.001981000         R1          
1970-01-01 00:00:03.377116000         R1          
1970-01-01 00:00:03.701811000         R2          
1970-01-01 00:00:03.910326000         R2          
1970-01-01 00:00:03.951355000         R2         

How can I get the following df from the one above?

Index                        R1   R2  
1970-01-01 00:00:00          1    0   
1970-01-01 00:00:01          0    1   
1970-01-01 00:00:02          1    1   
1970-01-01 00:00:03          1    1   

A one-liner would be appreciated. Regards, 阿西

2 answers:

Answer 0 (score: 3)

We can use pivot_table with aggfunc='size' to count the rows per second and per receiver, then turn the notnull() mask into int, i.e.

df.pivot_table(index = pd.Grouper(key='Index',freq='s'),columns='Receiver',aggfunc='size').notnull().astype(int)

Receiver                R1 R2
Index                        
1970-01-01 00:00:00      1  0
1970-01-01 00:00:01      0  1
1970-01-01 00:00:02      1  1
1970-01-01 00:00:03      1  1
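
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same pivot_table call. It assumes 'Index' is an ordinary datetime64 column rather than the frame's index (that is what pd.Grouper(key='Index') expects); the variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; 'Index' is assumed to be a regular column.
df = pd.DataFrame({
    "Index": pd.to_datetime([
        "1970-01-01 00:00:00.000000000", "1970-01-01 00:00:00.800000000",
        "1970-01-01 00:00:01.000287000", "1970-01-01 00:00:01.600896000",
        "1970-01-01 00:00:02.001388000", "1970-01-01 00:00:02.004698000",
        "1970-01-01 00:00:02.006706000", "1970-01-01 00:00:02.501351000",
        "1970-01-01 00:00:02.810382000", "1970-01-01 00:00:03.001981000",
        "1970-01-01 00:00:03.377116000", "1970-01-01 00:00:03.701811000",
        "1970-01-01 00:00:03.910326000", "1970-01-01 00:00:03.951355000",
    ]),
    "Receiver": ["R1", "R1", "R2", "R2", "R1", "R1", "R2",
                 "R2", "R2", "R1", "R1", "R2", "R2", "R2"],
})

# Count rows per (one-second bin, receiver). Combinations that never occur
# come out as NaN after the pivot, so notnull() marks "at least one row".
counts = df.pivot_table(
    index=pd.Grouper(key="Index", freq="s"),
    columns="Receiver",
    aggfunc="size",
)
result = counts.notnull().astype(int)
print(result)
```

If the timestamps are already the index, calling df.reset_index() first (or passing level= to pd.Grouper instead of key=) should give the same grouping.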

Answer 1 (score: 2)

df.set_index('Index').Receiver.resample('S').apply(lambda x : ','.join(set(x))).str.get_dummies(sep=',')
Out[909]: 
                     R1  R2
Index                      
1970-01-01 00:00:00   1   0
1970-01-01 00:00:01   0   1
1970-01-01 00:00:02   1   1
1970-01-01 00:00:03   1   1
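
The one-liner above does two things, and it may be easier to follow when split up. The sketch below first joins the distinct receivers seen in each second into a comma-separated string, then expands that string into indicator columns with str.get_dummies. It rebuilds the question's data under the assumption that 'Index' is a datetime column, uses sorted(set(x)) so the intermediate strings are deterministic, and writes the frequency as lowercase 's' because the uppercase 'S' alias is deprecated in recent pandas releases.

```python
import pandas as pd

# Sample data copied from the question; 'Index' is assumed to be a regular column.
df = pd.DataFrame({
    "Index": pd.to_datetime([
        "1970-01-01 00:00:00.000000000", "1970-01-01 00:00:00.800000000",
        "1970-01-01 00:00:01.000287000", "1970-01-01 00:00:01.600896000",
        "1970-01-01 00:00:02.001388000", "1970-01-01 00:00:02.004698000",
        "1970-01-01 00:00:02.006706000", "1970-01-01 00:00:02.501351000",
        "1970-01-01 00:00:02.810382000", "1970-01-01 00:00:03.001981000",
        "1970-01-01 00:00:03.377116000", "1970-01-01 00:00:03.701811000",
        "1970-01-01 00:00:03.910326000", "1970-01-01 00:00:03.951355000",
    ]),
    "Receiver": ["R1", "R1", "R2", "R2", "R1", "R1", "R2",
                 "R2", "R2", "R1", "R1", "R2", "R2", "R2"],
})

# Step 1: one string per second listing the distinct receivers in that bin,
# e.g. "R1,R2". A second with no rows would yield an empty string.
per_second = (
    df.set_index("Index")["Receiver"]
      .resample("s")
      .apply(lambda x: ",".join(sorted(set(x))))
)

# Step 2: split each string on "," and expand it into 0/1 indicator columns.
result = per_second.str.get_dummies(sep=",")
print(result)
```

One difference from the pivot_table answer: resample keeps seconds that contain no rows at all, and those would appear here as rows of zeros.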