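I want to calculate the number of days since the last breakdown occurred. My table has a date column (Day) in datetime format and a Number of breakdowns column.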
print (df)
           Day  Number of breakdowns
0   2017-01-09                   1.0
1   2017-01-12                   0.0
2   2017-01-13                   0.0
3   2017-01-14                   0.0
4   2017-01-16                   3.0
5   2017-01-17                   0.0
6   2017-01-18                   0.0
7   2017-01-19                   1.0
8   2017-01-20                   0.0
9   2017-01-21                   0.0
10  2017-01-23                   1.0
Answer 0 (score: 1)
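First, compare Number of breakdowns with 0 using ne (not equal), then take the cumulative sum with cumsum to label each group. transform('first') then returns the first Day of each group, which can be subtracted from Day to get a timedelta and converted to days with dt.days: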
import pandas as pd

df['Day'] = pd.to_datetime(df['Day'])
# Date of the most recent breakdown row, broadcast to every row of its group
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['New'] = (df['Day'] - s).dt.days
print (df)
           Day  Number of breakdowns  New
0   2017-01-09                   1.0    0
1   2017-01-12                   0.0    3
2   2017-01-13                   0.0    4
3   2017-01-14                   0.0    5
4   2017-01-16                   3.0    0
5   2017-01-17                   0.0    1
6   2017-01-18                   0.0    2
7   2017-01-19                   1.0    0
8   2017-01-20                   0.0    1
9   2017-01-21                   0.0    2
10  2017-01-23                   1.0    0
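
To make the grouping easier to follow, here is a minimal sketch that splits the one-liner into its intermediate steps. The sample frame is rebuilt from the data in the question; the names is_breakdown, group_id and last_breakdown are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Day": ["2017-01-09", "2017-01-12", "2017-01-13", "2017-01-14",
            "2017-01-16", "2017-01-17", "2017-01-18", "2017-01-19",
            "2017-01-20", "2017-01-21", "2017-01-23"],
    "Number of breakdowns": [1.0, 0.0, 0.0, 0.0, 3.0, 0.0,
                             0.0, 1.0, 0.0, 0.0, 1.0],
})
df["Day"] = pd.to_datetime(df["Day"])

# Step 1: True on every row where at least one breakdown occurred.
is_breakdown = df["Number of breakdowns"].ne(0)

# Step 2: running count of breakdown rows. Rows that share a value
# belong to the same "since the last breakdown" group.
group_id = is_breakdown.cumsum()

# Step 3: the first Day of each group, repeated on every row of the group.
last_breakdown = df.groupby(group_id)["Day"].transform("first")

# Step 4: whole days elapsed since that date.
df["New"] = (df["Day"] - last_breakdown).dt.days

print(group_id.tolist())   # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4]
print(df["New"].tolist())  # [0, 3, 4, 5, 0, 1, 2, 0, 1, 2, 0]
```

One caveat: if the frame started with rows before any breakdown, those rows would fall into group 0 and be counted from the first row of the frame rather than from a real breakdown, so they would probably need separate handling.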