Pandas: calculating the time difference between columns when a condition is met

Date: 2019-02-15 09:03:33

Tags: python pandas datetime difference

I have a pandas df with 2 columns: Day (dates in datetime format) and Number of breakdowns.

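I want to create two new columns: the first is the number of days since the Previous breakdown event (already done), and the second is the number of days until the Next breakdown event (which I am struggling with).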

Day            Number of breakdowns    Days from Previous Breakdown Event
2017-01-09                   0.0                                   0                                             
2017-01-12                   0.0                                   0
2017-01-13                   0.0                                   0
2017-01-14                   0.0                                   0
2017-01-16                   1.0                                   0
2017-01-17                   0.0                                   1
2017-01-18                   0.0                                   2
2017-01-19                   1.0                                   0
2017-01-20                   0.0                                   1
2017-01-21                   0.0                                   2
2017-01-23                   1.0                                   0

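Days from Previous Breakdown Event counts the days elapsed since the most recent breakdown. Rows before the first breakdown are set to 0.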

Code:

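# Start a new group at every row that has a breakdown; the first Day of each group is that breakdown's date
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days
# Rows up to and including the first breakdown have no earlier event, so set them to 0
zeros_index = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:zeros_index, 'Days from Previous Breakdown Event'] = 0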
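
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same steps. The DataFrame is rebuilt by hand from the table above, and the names group_key, first_breakdown and prev_days are only illustrative, not part of my real code.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# The running count of breakdowns starts a new group at every breakdown row.
group_key = df['Number of breakdowns'].ne(0).cumsum()

# Within each group the first Day is the date of the most recent breakdown.
s = df.groupby(group_key)['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days

# Rows up to and including the first breakdown have no earlier event: force 0.
first_breakdown = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:first_breakdown, 'Days from Previous Breakdown Event'] = 0

prev_days = df['Days from Previous Breakdown Event'].tolist()
print(prev_days)
```

One caveat: if the column contains no breakdowns at all, idxmax() returns the first index label, so that case would need separate handling.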

I need the "Days from Next Breakdown Event" column, and I expect it to look like this:

Day            Number of breakdowns    Days from Next Breakdown Event
2017-01-09                   0.0                                   7                                             
2017-01-12                   0.0                                   4
2017-01-13                   0.0                                   3
2017-01-14                   0.0                                   2
2017-01-16                   1.0                                   0
2017-01-17                   0.0                                   2
2017-01-18                   0.0                                   1
2017-01-19                   1.0                                   0
2017-01-20                   0.0                                   3
2017-01-21                   0.0                                   2
2017-01-23                   1.0                                   0

1 Answer:

Answer 0: (score: 1)

Reverse the order with iloc[::-1], use transform('last') instead of transform('first'), and also swap the subtraction to s - df['Day']:

# The group key is computed on the reversed column; groupby aligns it back to df by index
s = df.groupby(df['Number of breakdowns'].iloc[::-1].ne(0).cumsum())['Day'].transform('last')
df['Days from Next Breakdown Event'] = (s - df['Day']).dt.days
print(df)
          Day  Number of breakdowns  Days from Previous Breakdown Event  \
0  2017-01-09                   0.0                                   0   
1  2017-01-12                   0.0                                   0   
2  2017-01-13                   0.0                                   0   
3  2017-01-14                   0.0                                   0   
4  2017-01-16                   1.0                                   0   
5  2017-01-17                   0.0                                   1   
6  2017-01-18                   0.0                                   2   
7  2017-01-19                   1.0                                   0   
8  2017-01-20                   0.0                                   1   
9  2017-01-21                   0.0                                   2   
10 2017-01-23                   1.0                                   0   

    Days from Next Breakdown Event  
0                                7  
1                                4  
2                                3  
3                                2  
4                                0  
5                                2  
6                                1  
7                                0  
8                                3  
9                                2  
10                               0  
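
To run this end to end, a self-contained sketch could look like the following. The DataFrame is rebuilt from the question's table, and reversed_key, key_in_row_order and next_days are illustrative names added here to expose the intermediate grouping key.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Count breakdowns from the bottom up, so each breakdown row closes its group.
reversed_key = df['Number of breakdowns'].iloc[::-1].ne(0).cumsum()

# Same key viewed in the original row order, for inspection only.
key_in_row_order = reversed_key.sort_index().tolist()

# groupby aligns the key on the index; the last Day of each group is the next breakdown.
s = df.groupby(reversed_key)['Day'].transform('last')
df['Days from Next Breakdown Event'] = (s - df['Day']).dt.days

next_days = df['Days from Next Breakdown Event'].tolist()
print(key_in_row_order)
print(next_days)
```

The sample data ends on a breakdown row. If rows follow the final breakdown, they form a group of their own that contains no breakdown, so their values would measure the distance to the last row rather than to a breakdown.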

Details:

print(s)
0    2017-01-16
1    2017-01-16
2    2017-01-16
3    2017-01-16
4    2017-01-16
5    2017-01-19
6    2017-01-19
7    2017-01-19
8    2017-01-23
9    2017-01-23
10   2017-01-23
Name: Day, dtype: datetime64[ns]
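
The same helper series can probably also be built without groupby. The sketch below is a different technique from the one above: it keeps Day only on breakdown rows and back-fills with bfill. The names next_breakdown and days_to_next are illustrative, and the DataFrame is rebuilt from the question's table.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Keep the date only on breakdown rows (NaT elsewhere), then pull each
# breakdown date backwards onto the rows that precede it.
next_breakdown = df['Day'].where(df['Number of breakdowns'].ne(0)).bfill()

# Days remaining until the next breakdown; 0 on the breakdown rows themselves.
days_to_next = (next_breakdown - df['Day']).dt.days

print(next_breakdown.dt.strftime('%Y-%m-%d').tolist())
print(days_to_next.tolist())
```

With this variant, any rows after the final breakdown stay NaT, so their day count is NaN.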