Suppose I have a data frame with 2 columns, ds and y (the input format for fbprophet, see https://facebook.github.io/prophet/docs/quick_start.html):
ds y
2021-03-17 3135.73
2021-03-18 3027.99
2021-03-19 3074.96
2021-03-22 3110.87
2021-03-23 3110.87
2021-03-24 3110.87
2021-03-25 3110.87
Here is the code that generates this data frame:
import pandas as pd

# note: both columns are stored as strings here
data = {
    'ds': ['2021-03-17', '2021-03-18', '2021-03-19', '2021-03-22', '2021-03-23', '2021-03-24', '2021-03-25'],
    'y': ['3135.73', '3027.99', '3074.96', '3110.87', '3110.87', '3110.87', '3110.87']
}
df = pd.DataFrame(data)
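Because both columns above are created as strings, date comparisons and numeric operations will not behave as expected until the types are converted. The following is a minimal sketch, assuming the frame is meant to hold real dates in ds and floats in y:

```python
import pandas as pd

data = {
    "ds": ["2021-03-17", "2021-03-18", "2021-03-19", "2021-03-22",
           "2021-03-23", "2021-03-24", "2021-03-25"],
    "y": ["3135.73", "3027.99", "3074.96", "3110.87",
          "3110.87", "3110.87", "3110.87"],
}
df = pd.DataFrame(data)

# Convert the string columns to proper dtypes (assumed to be the intent).
df["ds"] = pd.to_datetime(df["ds"])   # strings -> datetime64
df["y"] = pd.to_numeric(df["y"])      # strings -> float64

print(df.dtypes)
```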
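How can I generate 2 new boolean columns, on_season and off_season, based on the dates in a loop?
For example, if the value in the ds column falls between date - 2 days and date, then on_season = TRUE and off_season = FALSE.
The dates in the loop here are '2021-03-19' and '2021-03-25', so on_season covers '2021-03-17' through '2021-03-19' and '2021-03-23' through '2021-03-25'. The desired output is: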
ds y on_season off_season
2021-03-17 3135.73 TRUE FALSE
2021-03-18 3027.99 TRUE FALSE
2021-03-19 3074.96 TRUE FALSE
2021-03-22 3110.87 FALSE TRUE
2021-03-23 3110.87 TRUE FALSE
2021-03-24 3110.87 TRUE FALSE
2021-03-25 3110.87 TRUE FALSE
Here is my work-in-progress code:
import pandas as pd
from datetime import timedelta

# the dates in a loop
for date in ['2021-03-19', '2021-03-25']:
    # each item is already a string, so no join is needed
    on_season = pd.to_datetime(date)
    on_season = on_season.strftime("%Y-%m-%d")
    # try to build the day range: 2 days before the date
    off_season = pd.to_datetime(date) - timedelta(2)
    off_season = off_season.strftime("%Y-%m-%d")  # convert datetime object to string
    # I am stuck on the dataframe part
Answer 0 (score: 0)
It looks like you only need to compute the difference in ds between adjacent rows:
df['ds'] = pd.to_datetime(df['ds'])
# check whether the gap between consecutive rows is smaller than 2 days
# (the first row has no predecessor, so its gap is filled with 0)
df['on_season'] = df['ds'].diff().dt.days.fillna(0) < 2
df['off_season'] = ~df['on_season']
df
ds y on_season off_season
0 2021-03-17 3135.73 True False
1 2021-03-18 3027.99 True False
2 2021-03-19 3074.96 True False
3 2021-03-22 3110.87 False True
4 2021-03-23 3110.87 True False
5 2021-03-24 3110.87 True False
6 2021-03-25 3110.87 True False
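The row-difference approach depends on the gaps between consecutive rows rather than on the dates in the question's loop. As an alternative, here is a minimal sketch that marks each row by whether ds falls inside a 2-day window ending at one of the loop dates. The names anchor_dates and window are illustrative and not from the original post, and the window is assumed to be inclusive at both ends.

```python
import pandas as pd

df = pd.DataFrame({
    "ds": ["2021-03-17", "2021-03-18", "2021-03-19", "2021-03-22",
           "2021-03-23", "2021-03-24", "2021-03-25"],
    "y": [3135.73, 3027.99, 3074.96, 3110.87, 3110.87, 3110.87, 3110.87],
})
df["ds"] = pd.to_datetime(df["ds"])

anchor_dates = ["2021-03-19", "2021-03-25"]  # the dates from the question's loop
window = pd.Timedelta(days=2)                # assumed window length: date - 2 days

# Start with every row off-season, then switch on rows inside any window.
on_season = pd.Series(False, index=df.index)
for date in anchor_dates:
    end = pd.Timestamp(date)
    # between() is inclusive on both ends by default
    on_season = on_season | df["ds"].between(end - window, end)

df["on_season"] = on_season
df["off_season"] = ~on_season
print(df)
```

For the sample data this marks 2021-03-22 as off-season and every other row as on-season, the same as the desired output in the question. On other data the two approaches can disagree, since this one never looks at neighbouring rows.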