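I have the following data. How can I add all the dates (from the 1st to the end of each month)? And how can I remove Saturdays and Sundays from this dataset?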
Date values
31/03/14 -0.0123
30/04/14 0.11168
30/06/14 0.0997
31/07/14 0.007
30/09/14 0.886
Date values
1/3/14
2/3/14
.....
..
31/3/14
1/4/14
2/4/14
....
.....
30/09/14
Answer 0 (score: 2)
import io
import pandas as pd
data = '''\
Date values
31/03/14 -0.0123
30/04/14 0.11168
30/06/14 0.0997
31/07/14 0.007
30/09/14 0.886'''
# Read the dataset, parse Date as day-first datetimes and set Date as the index
# (io.StringIO replaces pd.compat.StringIO, which newer pandas no longer provides)
df = pd.read_csv(io.StringIO(data), sep=r'\s+', parse_dates=['Date'], dayfirst=True, index_col='Date')
# Resample to daily frequency; first() leaves the newly added days as NaN
df = df.resample('D').first() # or mean(); sum() fills the added days with 0 in recent pandas
# Keep weekdays smaller than 5 (drops Saturday and Sunday) and reset the index
df = df.loc[df.index.weekday < 5].reset_index()
print(df.head())
This gives (first 5 rows printed):
Date values
0 2014-03-31 -0.0123
1 2014-04-01 NaN
2 2014-04-02 NaN
3 2014-04-03 NaN
4 2014-04-04 NaN
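An equivalent, more compact version, assuming you have already loaded the dataset. I have also added a mask for May and August here, in case you want to exclude those months.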
df = df.set_index(pd.to_datetime(df.Date, dayfirst=True)).drop('Date', axis=1)
df = df.resample('D').first()
m1 = df.index.weekday < 5 # mask1 (no Sat/Sun)
m2 = ~df.index.month.isin([5, 8]) # mask2 (not May or August)
df = df.loc[m1 & m2].reset_index()
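Note that resample('D') only spans the first to the last date present in the data, so the result starts at 2014-03-31 rather than on the 1st of March as the question asks. One possible way around this, sketched below, is to build the daily calendar explicitly and reindex onto it. The names start, full_days and out are only illustrative, and the explicit format string assumes the dates are always dd/mm/yy.

```python
import io
import pandas as pd

data = """Date values
31/03/14 -0.0123
30/04/14 0.11168
30/06/14 0.0997
31/07/14 0.007
30/09/14 0.886"""

# Load the data and parse the dates with an explicit day-first format
df = pd.read_csv(io.StringIO(data), sep=r"\s+")
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%y")
df = df.set_index("Date")

# Daily calendar from the 1st of the first month to the last date in the data
start = df.index.min().replace(day=1)
full_days = pd.date_range(start, df.index.max(), freq="D")

# reindex adds the missing days as NaN rows; then keep Monday to Friday only
out = df.reindex(full_days).rename_axis("Date")
out = out.loc[out.index.weekday < 5].reset_index()
print(out.head())
```

The same m2 mask as above can be combined with the weekday filter if whole months should be left out.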
Answer 1 (score: 1)
You can use date_range:
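# Parse the dates as day-first (dd/mm/yy)
df.Date=pd.to_datetime(df.Date, dayfirst=True)
# For each row, list every day from the 1st of that month up to the row's date
s=pd.DataFrame({'Date':sum([pd.date_range(x,y,freq='D').tolist() for x,y in zip(pd.to_datetime(df.Date.dt.strftime('%Y-%m')),df.Date)],[])})
# Left merge keeps all generated days; days without data get NaN
s=s.merge(df,how='left')
# Keep Monday to Friday only
s=s[s.Date.dt.weekday<5]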
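As a variation on the same idea, the weekends can be skipped while generating the dates by passing freq='B' (business days, Monday to Friday) to date_range, so no separate weekday filter is needed. This is only a sketch under the assumption that the dates are always dd/mm/yy; month_starts, ranges, days and out are illustrative names. Like the code above, it only covers the months that appear in the data.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["31/03/14", "30/04/14", "30/06/14", "31/07/14", "30/09/14"],
    "values": [-0.0123, 0.11168, 0.0997, 0.007, 0.886],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%y")

# First day of each row's month
month_starts = df["Date"].dt.to_period("M").dt.to_timestamp()

# Business days (Mon-Fri) from the 1st of the month up to the row's date
ranges = [pd.date_range(s, e, freq="B") for s, e in zip(month_starts, df["Date"])]
days = pd.DataFrame({"Date": [d for r in ranges for d in r]})

# Left merge keeps every generated day; days without data get NaN
out = days.merge(df, on="Date", how="left")
print(out.head())
```

Note that freq='B' does not account for public holidays; those would still need a separate filter.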