Fill missing date (day) values with 0

Asked: 2020-07-08 21:23:25

Tags: python pandas indexing

I have a DataFrame:

     day  Datavalue
    2020-06-01   3.179695
    2020-06-02   0.132487
    2020-06-08   3.179695
    2020-06-09   3.179695
    2020-06-10   3.179695

I want to set a date range and add every date that is missing from the DataFrame with a value of 0, for example:

     day  Datavalue
    2020-06-01   3.179695
    2020-06-02   0.132487
    2020-06-03   0
    2020-06-04   0
    2020-06-05   0
    2020-06-06   0
    2020-06-07   0
    2020-06-08   3.179695
    2020-06-09   3.179695
    2020-06-10   3.179695

I tried:

      mydates = pd.period_range(date - timedelta(40), date + timedelta(40))
      x = data.set_index('day')
      x = data.reindex(mydates, fill_value=0)


But this just sets everything to zero.

What am I doing wrong?

Thanks

2 Answers:

Answer 0: (score: 3)

Assuming you want to do this for the whole DataFrame, use asfreq:

df.set_index('day').asfreq('1D', fill_value=0)

            Datavalue
day                  
2020-06-01   3.179695
2020-06-02   0.132487
2020-06-03   0.000000
2020-06-04   0.000000
2020-06-05   0.000000
2020-06-06   0.000000
2020-06-07   0.000000
2020-06-08   3.179695
2020-06-09   3.179695
2020-06-10   3.179695
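
For anyone who wants to reproduce this end to end, here is a minimal sketch built from the sample data in the question. It assumes the day column may start out as strings, so it converts it with pd.to_datetime first; asfreq needs a DatetimeIndex to work.

```python
import pandas as pd

# Sample data copied from the question; storing 'day' as strings is an assumption.
df = pd.DataFrame({
    "day": ["2020-06-01", "2020-06-02", "2020-06-08", "2020-06-09", "2020-06-10"],
    "Datavalue": [3.179695, 0.132487, 3.179695, 3.179695, 3.179695],
})

# asfreq works on a DatetimeIndex, so convert before setting the index.
df["day"] = pd.to_datetime(df["day"])

# Add one row per missing calendar day between the first and last date, filled with 0.
filled = df.set_index("day").asfreq("D", fill_value=0)
print(filled)
```

Note that asfreq only fills gaps between the first and last date already present, and fill_value applies only to the newly created rows, not to NaN values that were already in the data.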

Answer 1: (score: 1)

Something like this might work:

from datetime import timedelta
import pandas as pd

delta = 2  # number of days to add before the first value and after the last value (as your code seems to need)

# Assumes df['day'] is already a datetime column sorted in ascending order
mydates = pd.period_range(df['day'].iloc[0] - timedelta(delta), df['day'].iloc[-1] + timedelta(delta))

# Convert the PeriodIndex to a DatetimeIndex so it matches the dtype of df['day']
mydates = mydates.to_timestamp()

# Build a DataFrame of all dates and outer-merge it with the original df, which holds the values
dates_df = pd.DataFrame(mydates, columns=['day'])
new_df = pd.merge(df, dates_df, on='day', how='outer').sort_values('day').fillna(0)
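
The reindex approach from the question can also be made to work. In the original attempt, data.reindex(...) reindexes the original frame, whose index is still the default 0..n-1, so none of the labels in mydates match and every row is filled with 0. The PeriodIndex from pd.period_range is also a different index type from a datetime index, which is likely a second reason for the mismatch. The sketch below is one possible fix under stated assumptions: the sample data comes from the question, and the 2-day padding (the name pad) is a hypothetical stand-in for the 40 days around a reference date used there.

```python
from datetime import timedelta

import pandas as pd

# Sample data copied from the question; storing 'day' as strings is an assumption.
df = pd.DataFrame({
    "day": ["2020-06-01", "2020-06-02", "2020-06-08", "2020-06-09", "2020-06-10"],
    "Datavalue": [3.179695, 0.132487, 3.179695, 3.179695, 3.179695],
})
df["day"] = pd.to_datetime(df["day"])

pad = timedelta(days=2)  # hypothetical padding before the first and after the last date
x = df.set_index("day")

# date_range returns a DatetimeIndex, the same index type as x, so labels can match.
mydates = pd.date_range(x.index.min() - pad, x.index.max() + pad, freq="D")

# Reindex x (the frame indexed by day), not the original frame with its integer index.
result = x.reindex(mydates, fill_value=0).rename_axis("day").reset_index()
print(result)
```

Compared with the merge, this keeps everything on the index and avoids the separate sort_values and fillna steps.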