I have a DataFrame:
day Datavalue
2020-06-01 3.179695
2020-06-02 0.132487
2020-06-08 3.179695
2020-06-09 3.179695
2020-06-10 3.179695
I want to set a date range and add every date that is missing from the DataFrame with a value of 0, for example:
day Datavalue
2020-06-01 3.179695
2020-06-02 0.132487
2020-06-03 0
2020-06-04 0
2020-06-05 0
2020-06-06 0
2020-06-07 0
2020-06-08 3.179695
2020-06-09 3.179695
2020-06-10 3.179695
I tried:
mydates = pd.period_range(date - timedelta(40), date + timedelta(40))
x = data.set_index('day')
x = data.reindex(mydates, fill_value=0)
But this just sets everything to zero.
What am I doing wrong?
Thanks
Answer 0 (score: 3)
Assuming you want to do this for the whole DataFrame, use asfreq:
df.set_index('day').asfreq('1D', fill_value=0)
Datavalue
day
2020-06-01 3.179695
2020-06-02 0.132487
2020-06-03 0.000000
2020-06-04 0.000000
2020-06-05 0.000000
2020-06-06 0.000000
2020-06-07 0.000000
2020-06-08 3.179695
2020-06-09 3.179695
2020-06-10 3.179695
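As a self-contained sketch of the above, the block below rebuilds the sample data from the question and applies asfreq. It assumes the day column holds real datetimes (parsed here with pd.to_datetime), since asfreq needs a datetime-like index. It also shows one possible way to pad beyond the first and last dates, which is what the attempt in the question seems to want: reindex against pd.date_range. The 2-day padding and the variable names are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; 'day' is parsed to datetime explicitly.
df = pd.DataFrame({
    "day": pd.to_datetime(
        ["2020-06-01", "2020-06-02", "2020-06-08", "2020-06-09", "2020-06-10"]
    ),
    "Datavalue": [3.179695, 0.132487, 3.179695, 3.179695, 3.179695],
})

indexed = df.set_index("day")

# Fill the gaps between the first and last existing dates with 0.
filled = indexed.asfreq("D", fill_value=0)

# Optional padding around the data (2 days is an arbitrary, hypothetical choice).
pad = pd.Timedelta(days=2)
full_range = pd.date_range(
    indexed.index.min() - pad, indexed.index.max() + pad, freq="D"
)

# Reindex the frame that is already indexed by 'day', using a DatetimeIndex.
padded = indexed.reindex(full_range, fill_value=0).rename_axis("day")

print(filled)
print(padded)
```

Two details differ from the attempt in the question. First, the reindex is applied to the frame indexed by day (the question calls data.reindex rather than x.reindex, so assuming data still has its default integer index, none of the new labels match and every row gets the fill value). Second, pd.date_range yields a DatetimeIndex, whereas pd.period_range yields a PeriodIndex, whose labels may not match datetime index values.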
Answer 1 (score: 1)
Something like this might work:
import pandas as pd
from datetime import timedelta

delta = 2  # number of days before the first value and after the last value (your code seems to need this)
# Assumes df['day'] already holds datetimes (the column is named 'day' in the question)
mydates = pd.period_range(df['day'].iloc[0] - timedelta(delta), df['day'].iloc[-1] + timedelta(delta))
# Convert the PeriodIndex to a DatetimeIndex:
mydates = mydates.to_timestamp()
# Build a DataFrame of dates and merge it with the original df that contains the values
dates_df = pd.DataFrame(mydates, columns=['day'])
new_df = pd.merge(df, dates_df, on='day', how='outer').sort_values('day').fillna(0)
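For comparison, here is a runnable variant of the merge idea on the sample data from the question. It is a sketch under a few assumptions: the column is named day as in the question, it holds datetimes, and the calendar is built with pd.date_range and joined with a left merge rather than the period_range plus outer merge used above. The names all_days and new_df are only illustrative.

```python
from datetime import timedelta

import pandas as pd

# Sample data copied from the question; 'day' is parsed to datetime explicitly.
df = pd.DataFrame({
    "day": pd.to_datetime(
        ["2020-06-01", "2020-06-02", "2020-06-08", "2020-06-09", "2020-06-10"]
    ),
    "Datavalue": [3.179695, 0.132487, 3.179695, 3.179695, 3.179695],
})

delta = 2  # days of padding before the first and after the last date

# One row per calendar day in the padded range.
all_days = pd.DataFrame({
    "day": pd.date_range(
        df["day"].iloc[0] - timedelta(delta),
        df["day"].iloc[-1] + timedelta(delta),
        freq="D",
    )
})

# A left merge keeps every calendar day in order; days without data become NaN, then 0.
new_df = all_days.merge(df, on="day", how="left").fillna({"Datavalue": 0})

print(new_df)
```

Starting the merge from the full calendar means the result is already sorted by day, so no separate sort_values call is needed.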