Adding all missing dates in pandas

Time: 2018-03-31 22:33:29

Tags: python pandas

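I have the following data. How can I add all the dates (from the 1st to the end of the month)? And how can I remove Saturdays and Sundays from this dataset?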

Date        values
31/03/14    -0.0123
30/04/14    0.11168
30/06/14    0.0997
31/07/14    0.007
30/09/14    0.886



Date    values
1/3/14
2/3/14
.....
..
31/3/14
1/4/14
2/4/14
....
.....
30/09/14

2 answers:

Answer 0: (score: 2)

Assuming you can reload the dataset from a CSV

import pandas as pd
from io import StringIO

data = '''\
Date        values
31/03/14    -0.0123
30/04/14    0.11168
30/06/14    0.0997
31/07/14    0.007
30/09/14    0.886'''

# Read the dataset, parse the day-first dates (dd/mm/yy) and
# set Date as the index. pd.compat.StringIO no longer exists in
# current pandas, so io.StringIO is used instead.
df = pd.read_csv(StringIO(data), sep=r'\s+', parse_dates=['Date'], dayfirst=True, index_col='Date')

# Resample to daily frequency; days without data become NaN
df = df.resample('D').first()  # or mean(); note that sum() returns 0 for empty days in recent pandas

# Keep weekdays smaller than 5 (Monday to Friday), which removes Saturday and Sunday, then reset the index
df = df.loc[df.index.weekday < 5].reset_index()

print(df.head())

You get (first 5 rows printed):

        Date  values
0 2014-03-31 -0.0123
1 2014-04-01     NaN
2 2014-04-02     NaN
3 2014-04-03     NaN
4 2014-04-04     NaN
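
The output above starts at 2014-03-31 because `resample` only spans the first to the last observed date. If the calendar should begin on the 1st of the first month, as the question asks, one option is to reindex against an explicit `pd.date_range`. This is a minimal sketch under that assumption, not part of the original answer; the names `start`, `end`, `full_range` and `out` are illustrative.

```python
import pandas as pd
from io import StringIO

data = '''\
Date        values
31/03/14    -0.0123
30/04/14    0.11168
30/06/14    0.0997
31/07/14    0.007
30/09/14    0.886'''

# Parse the day-first dates explicitly and use them as the index
df = pd.read_csv(StringIO(data), sep=r'\s+')
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')
df = df.set_index('Date')

# Calendar from the 1st of the first month to the end of the last month
start = df.index.min().replace(day=1)
end = df.index.max() + pd.offsets.MonthEnd(0)  # unchanged if already a month end
full_range = pd.date_range(start, end, freq='D')

# Reindexing inserts the missing days with NaN values
out = df.reindex(full_range).rename_axis('Date')

# Keep Monday to Friday only
out = out[out.index.weekday < 5].reset_index()
print(out.head())
```

Since 2014-03-01 falls on a Saturday, the first row kept here is 2014-03-03.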

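Assuming you have already loaded the dataset

This is the equivalent (compact) version for a dataset that is already loaded. I also added a mask for May and August here, in case you want to exclude those months.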

# Dates are day-first (dd/mm/yy), so the format is given explicitly
df = df.set_index(pd.to_datetime(df.Date, format='%d/%m/%y')).drop('Date', axis=1)
df = df.resample('D').first()
m1 = df.index.weekday < 5          # mask1 (no sat/sun)
m2 = ~df.index.month.isin([5,8])   # mask2 (not May or August)
df = df.loc[m1 & m2].reset_index() 
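
As a variation on the weekday mask, the weekend filter can be folded into the frequency itself: the pandas alias `'B'` generates business days (Monday to Friday) only. The sketch below assumes the data is held in a DataFrame with string dates, as in the question, and that no holiday calendar is needed; `bdays` and `out` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['31/03/14', '30/04/14', '30/06/14', '31/07/14', '30/09/14'],
    'values': [-0.0123, 0.11168, 0.0997, 0.007, 0.886],
})
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')

# 'B' yields Monday-to-Friday dates only, so no weekday mask is needed
bdays = pd.date_range(df['Date'].min(), df['Date'].max(), freq='B')
out = df.set_index('Date').reindex(bdays).rename_axis('Date')

# Optional month mask, mirroring m2 above (drop May and August)
out = out[~out.index.month.isin([5, 8])].reset_index()
print(out.head())
```

As with the mask version, any observation that falls on a weekend would be dropped by this approach.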

Answer 1: (score: 1)

You can use date_range

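# Parse the day-first date strings (dd/mm/yy)
df.Date = pd.to_datetime(df.Date, format='%d/%m/%y')
# For each row, build the daily range from the 1st of its month up to the row's date
s = pd.DataFrame({'Date': sum([pd.date_range(x, y, freq='D').tolist() for x, y in zip(pd.to_datetime(df.Date.dt.strftime('%Y-%m')), df.Date)], [])})

# A left merge keeps every generated day; the default inner merge would keep only the original rows
s = s.merge(df, how='left')
# Keep Monday to Friday only
s = s[s.Date.dt.weekday < 5]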
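
One thing to be aware of with the `date_range` approach: it only expands the months that actually have an observation, so May and August never appear in the result. The following self-contained sketch illustrates this under the same assumptions as the question's data; `month_start` and `days` are illustrative names, and the left merge is an assumption about the intended result (every generated day kept, with NaN where there is no value).

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['31/03/14', '30/04/14', '30/06/14', '31/07/14', '30/09/14'],
    'values': [-0.0123, 0.11168, 0.0997, 0.007, 0.886],
})
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')

# One daily range per observed row: 1st of that month up to the observation date
month_start = df['Date'].dt.to_period('M').dt.to_timestamp()
days = [pd.date_range(a, b, freq='D') for a, b in zip(month_start, df['Date'])]
s = pd.DataFrame({'Date': [d for r in days for d in r]})

# Left merge keeps every generated day; days without data get NaN
s = s.merge(df, on='Date', how='left')

# Keep Monday to Friday only
s = s[s['Date'].dt.weekday < 5].reset_index(drop=True)

# Months present in the result
print(sorted(s['Date'].dt.month.unique().tolist()))
```

If the gap months are wanted as well, building a single range over the whole period and reindexing is the simpler route.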