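I have a pandas df with 2 columns: Day (a date in datetime format) and Number of breakdowns.
I want to create two new columns: the first is the number of days from the Previous breakdown event (already done), and the second is the number of days to the Next breakdown event (which I am struggling with).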
Day Number of breakdowns Days from Previous Breakdown Event
2017-01-09 0.0 0
2017-01-12 0.0 0
2017-01-13 0.0 0
2017-01-14 0.0 0
2017-01-16 1.0 0
2017-01-17 0.0 1
2017-01-18 0.0 2
2017-01-19 1.0 0
2017-01-20 0.0 1
2017-01-21 0.0 2
2017-01-23 1.0 0
Days from Previous Breakdown Event counts the days elapsed since the last breakdown occurred.
Code:
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days
zeros_index = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:zeros_index,'Days from Previous Breakdown Event'] = 0
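For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the table above (the DataFrame construction is my assumption about how the data is stored) and then runs the same four lines, with comments on what each step does.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Every breakdown row starts a new group, so 'first' is that group's breakdown date.
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days

# Rows up to and including the first breakdown have no earlier event: reset them to 0.
zeros_index = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:zeros_index, 'Days from Previous Breakdown Event'] = 0

print(df)
```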
I need a "Days from Next Breakdown Event" column, which I expect to look like this:
Day Number of breakdowns Days from Next Breakdown Event
2017-01-09 0.0 7
2017-01-12 0.0 4
2017-01-13 0.0 3
2017-01-14 0.0 2
2017-01-16 1.0 0
2017-01-17 0.0 2
2017-01-18 0.0 1
2017-01-19 1.0 0
2017-01-20 0.0 3
2017-01-21 0.0 2
2017-01-23 1.0 0
Answer 0 (score: 1)
Reverse the order with iloc[::-1], use transform with last, and then also swap the subtraction to s - df['Day']:
s = df.groupby(df['Number of breakdowns'].iloc[::-1].ne(0).cumsum())['Day'].transform('last')
df['Days from Next Breakdown Event'] = (s - df['Day']).dt.days
print (df)
Day Number of breakdowns Days from Previous Breakdown Event \
0 2017-01-09 0.0 0
1 2017-01-12 0.0 0
2 2017-01-13 0.0 0
3 2017-01-14 0.0 0
4 2017-01-16 1.0 0
5 2017-01-17 0.0 1
6 2017-01-18 0.0 2
7 2017-01-19 1.0 0
8 2017-01-20 0.0 1
9 2017-01-21 0.0 2
10 2017-01-23 1.0 0
Days from Next Breakdown Event
0 7
1 4
2 3
3 2
4 0
5 2
6 1
7 0
8 3
9 2
10 0
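To see why reversing works, it may help to look at the group keys themselves. The sketch below rebuilds the sample frame (the construction and the name keys are mine, for illustration only) and prints the reversed cumulative count. The reversed Series keeps its original index labels, so groupby aligns it back to the row order of df, and each breakdown row becomes the last row of its group.

```python
import pandas as pd

df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Count breakdowns cumulatively from the bottom of the frame upwards.
keys = df['Number of breakdowns'].iloc[::-1].ne(0).cumsum()

# Shown in the original row order: rows 0-4 share a key, then 5-7, then 8-10.
print(keys.sort_index().tolist())
# [3, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1]

# 'last' picks the breakdown date that closes each group.
s = df.groupby(keys)['Day'].transform('last')
df['Days from Next Breakdown Event'] = (s - df['Day']).dt.days
print(df['Days from Next Breakdown Event'].tolist())
# [7, 4, 3, 2, 0, 2, 1, 0, 3, 2, 0]
```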
Details:
print (s)
0 2017-01-16
1 2017-01-16
2 2017-01-16
3 2017-01-16
4 2017-01-16
5 2017-01-19
6 2017-01-19
7 2017-01-19
8 2017-01-23
9 2017-01-23
10 2017-01-23
Name: Day, dtype: datetime64[ns]
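One caveat, which does not arise in the sample data because it ends on a breakdown: if the frame had rows after the final breakdown, the groupby approach would measure them against the last date in the frame rather than against a real event. A possible alternative, sketched here with where and bfill on hypothetical data (the two extra days are invented for illustration), leaves those rows as NaN instead.

```python
import pandas as pd

# Hypothetical data: the last rows of the question plus two days after the final breakdown.
df = pd.DataFrame({
    'Day': pd.to_datetime(['2017-01-20', '2017-01-21', '2017-01-23',
                           '2017-01-24', '2017-01-25']),
    'Number of breakdowns': [0.0, 0.0, 1.0, 0.0, 0.0],
})

# Keep the date only on breakdown rows, then back-fill so each row sees the next breakdown date.
next_breakdown = df['Day'].where(df['Number of breakdowns'].ne(0)).bfill()

# Rows after the last breakdown have no next event, so the result stays NaN there.
df['Days from Next Breakdown Event'] = (next_breakdown - df['Day']).dt.days
print(df)
```

Because of the NaN values the new column comes out as float; whether NaN or some sentinel is preferable depends on how the column is used downstream.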