I have a dataset (a pandas DataFrame) that looks like this:
Date Open High ... date year month
0 2002-05-23 1.156429 1.242857 ... 2002-05-23 2002 5
1 2002-05-24 1.214286 1.225000 ... 2002-05-24 2002 5
2 2002-05-28 1.213571 1.232143 ... 2002-05-28 2002 5
3 2002-05-29 1.164286 1.164286 ... 2002-05-29 2002 5
4 2002-05-30 1.107857 1.107857 ... 2002-05-30 2002 5
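How can I get the first and last observation for each week, month, quarter, six months, year, and so on? I want to pull the prices from the dataset and then compute the returns for each period.
Edit:
Your code gives me the following output for df1: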
Index first last
2002-05-26 1.15643 1.21429
2002-06-02 1.21357 1.07857
I would like:
First date Last date First value Last value
2002-05-26 2005-06-01 1.15643 1.21429
2002-06-02 2002-06-09 1.21357 1.07857
Answer 0 (score: 1)
I believe you need DataFrame.resample, aggregating with first and last:
import pandas as pd

# 'Date' must be a datetime64 column for resample(on='Date') to work
df1 = df.resample('W', on='Date')['Open'].agg(['first','last'])    # weekly
df2 = df.resample('M', on='Date')['Open'].agg(['first','last'])    # monthly
df3 = df.resample('Q', on='Date')['Open'].agg(['first','last'])    # quarterly
df4 = df.resample('6M', on='Date')['Open'].agg(['first','last'])   # every 6 months
df5 = df.resample('Y', on='Date')['Open'].agg(['first','last'])    # yearly
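Since the question also asks about returns, here is a minimal sketch of one way to get them from the first/last columns. It assumes a simple return defined as last / first - 1 within each period, which may not be the definition you want (for example, close-to-close returns across periods would need a different calculation). The sample prices are made up for illustration; only the 'Date' and 'Open' column names come from the question.

```python
import pandas as pd

# Made-up sample prices; only the column names 'Date' and 'Open' come from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2002-05-23", "2002-05-24", "2002-05-28", "2002-05-29", "2002-05-30"]
    ),
    "Open": [1.0, 1.1, 2.0, 2.2, 2.5],
})

# First and last observed price in each calendar week (weeks end on Sunday).
weekly = df.resample("W", on="Date")["Open"].agg(["first", "last"])

# Assumed definition: simple return from the first to the last price of the period.
weekly["return"] = weekly["last"] / weekly["first"] - 1

print(weekly)
```

The same pattern should work for the other frequencies by changing the alias. Note that on pandas 2.2 and later the aliases 'M', 'Q' and 'Y' are deprecated in favour of 'ME', 'QE' and 'YE'.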
Edit:
print (df)
Date Open
0 2002-05-23 1.1
1 2002-05-24 1.2
2 2002-05-28 1.3
3 2002-05-29 1.4
4 2002-05-30 1.5
5 2002-05-31 1.6
6 2002-06-01 1.7
7 2002-06-02 1.8
8 2002-06-03 1.9
9 2002-06-04 2.0
df['Date'] = pd.to_datetime(df['Date'])
df1 = df.resample('W', on='Date')['Open'].agg(['first','last'])
print (df1)
first last
Date
2002-05-26 1.1 1.2
2002-06-02 1.3 1.8
2002-06-09 1.9 2.0
df1 = (df.set_index('Date')
         .resample('W')['Open']
         .agg([('First date', lambda x: x.index[0]),
               ('Last date', lambda x: x.index[-1]),
               ('First value', 'first'),
               ('Last value', 'last')]))
print (df1)
First date Last date First value Last value
Date
2002-05-26 2002-05-23 2002-05-24 1.1 1.2
2002-06-02 2002-05-28 2002-06-02 1.3 1.8
2002-06-09 2002-06-03 2002-06-04 1.9 2.0
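One caveat with the lambdas above: I would expect x.index[0] to fail for a period that contains no rows at all (for example a week with no trading days), because there is no first element to index. A possible alternative, sketched below under that assumption, copies the dates into a helper column and uses named aggregation with pd.Grouper, so empty periods come back as NaT/NaN and can be dropped. The helper column name obs_date and the sample data are made up for illustration.

```python
import pandas as pd

# Made-up data with a deliberate gap: no rows fall in the week ending 2002-06-02.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2002-05-23", "2002-05-24", "2002-06-03", "2002-06-04"]),
    "Open": [1.1, 1.2, 1.9, 2.0],
})

# Copy the dates into a helper column ('obs_date' is a hypothetical name)
# so they can be aggregated alongside the prices.
summary = (
    df.assign(obs_date=df["Date"])
      .groupby(pd.Grouper(key="Date", freq="W"))
      .agg(**{
          "First date": ("obs_date", "first"),
          "Last date": ("obs_date", "last"),
          "First value": ("Open", "first"),
          "Last value": ("Open", "last"),
      })
)

n_periods = len(summary)    # includes the empty week
summary = summary.dropna()  # keep only periods that have observations

print(summary)
```

Dropping the empty rows is a design choice; if you would rather see which periods had no data, skip the dropna() call.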