I have a DataFrame, and for each row I want to calculate the time elapsed since the last earnings date.
DF
Date        Earnings Reported
2018-04-02  1
2018-04-03  0
2018-04-04  0
DF - desired
Date        Earnings Reported  DaySinceEarnings
2018-04-02  1                  0
2018-04-03  0                  1
2018-04-04  0                  2
I tried writing a lambda function, but it does not work:
df['DaySinceEarnings'] = df.groupby['Earnings Reported'].apply(lambda x: (x == '1') * (x == '1').cumsum())
Answer 0 (score: 0)
import pandas as pd

df = pd.DataFrame(
    {'Date': ['2018-04-02',
              '2018-04-03',
              '2018-04-04',
              '2018-04-05',
              '2018-04-06',
              '2018-04-07'],
     'Earnings Reported': [1, 0, 0, 1, 1, 0]}
)
df['Date'] = pd.to_datetime(df['Date'])
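
def only_include_reported_days(x):
    # x is a single row of df; earnings days themselves get 0
    x['DaySinceEarnings'] = 0
    if x['Earnings Reported'] == 1:
        return x
    # Earlier rows on which earnings were reported
    # (assumes at least one such row exists before this one)
    sub = df[(df['Date'] < x['Date']) &
             (df['Earnings Reported'] == 1)]
    # Days elapsed since the most recent earlier earnings date
    x['DaySinceEarnings'] = (x['Date'] - max(sub['Date'])).days
    return x

# Apply the function row by row
df = df.apply(only_include_reported_days, axis=1)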
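
The row-wise function above rescans the whole frame for every row. On larger data, a vectorized version may be preferable. The following is only a sketch, not part of the original answer: it assumes the rows are already sorted by Date and that the first row is an earnings day, and the helper name last_earnings is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2018-04-02', '2018-04-03', '2018-04-04',
                            '2018-04-05', '2018-04-06', '2018-04-07']),
    'Earnings Reported': [1, 0, 0, 1, 1, 0],
})

# Keep the date only on earnings rows (NaT elsewhere), then carry the
# most recent earnings date forward. Assumes df is sorted by Date.
last_earnings = df['Date'].where(df['Earnings Reported'] == 1).ffill()

# Calendar days between each row and its most recent earnings date
df['DaySinceEarnings'] = (df['Date'] - last_earnings).dt.days

print(df)
```

Rows that come before the first earnings date have nothing to carry forward, so they end up as NaN rather than raising an error.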