I have some time series data in a pandas DataFrame:
prices.head()
Time A B C D
0 2012-01-02 08:00:30 NaN 47.1650 31.51 58.16
1 2012-01-02 08:01:00 NaN 47.2400 31.48 58.19
2 2012-01-02 08:01:30 NaN 47.2750 31.46 58.21
3 2012-01-02 08:02:00 NaN 47.3250 31.40 58.17
4 2012-01-02 08:02:30 NaN 47.3325 31.42 58.07
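I want to create 4 new columns containing each day's closing price. How can I do this?
Samples belonging to day 1 should carry day 1's closing price, and so on.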
Answer 0 (score: 0)
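You can groupby the date, take the last value in each group, and then join the result back onto the original frame (df here is the prices DataFrame from the question).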
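# Derive a calendar-date column to group on (Time must have a datetime dtype)
df['date'] = df.Time.dt.date
# Take each day's last value per column and join it back onto every row of that day
print(df.join(df.groupby('date')[['A','B','C','D']].last(), rsuffix='_close', on='date'))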
Time A B C D date A_close \
0 2012-01-02 08:00:30 NaN 47.1650 31.51 58.16 2012-01-02 NaN
1 2012-01-02 08:01:00 NaN 47.2400 31.48 58.19 2012-01-02 NaN
2 2012-01-02 08:01:30 NaN 47.2750 31.46 58.21 2012-01-02 NaN
3 2012-01-02 08:02:00 NaN 47.3250 31.40 58.17 2012-01-02 NaN
4 2012-01-02 08:02:30 NaN 47.3325 31.42 58.07 2012-01-02 NaN
B_close C_close D_close
0 47.3325 31.42 58.07
1 47.3325 31.42 58.07
2 47.3325 31.42 58.07
3 47.3325 31.42 58.07
4 47.3325 31.42 58.07
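
As an alternative to the join shown above, the same result can probably be obtained with groupby(...).transform('last'), which broadcasts each day's last value back to every row without a helper date column. This is only a minimal sketch: the two-day sample data is made up for illustration, and it assumes the rows are already sorted by Time. Note that 'last' skips NaN, so each close is the last non-missing value of the day.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data spanning two days (not taken from the original post).
prices = pd.DataFrame({
    "Time": pd.to_datetime([
        "2012-01-02 08:00:30", "2012-01-02 08:01:00", "2012-01-02 08:01:30",
        "2012-01-03 08:00:30", "2012-01-03 08:01:00",
    ]),
    "A": [np.nan] * 5,
    "B": [47.165, 47.24, 47.275, 48.0, 48.5],
    "C": [31.51, 31.48, 31.46, 32.0, 32.1],
    "D": [58.16, 58.19, 58.21, 59.0, 59.2],
})

cols = ["A", "B", "C", "D"]

# Truncate each timestamp to midnight so that rows from the same day share a key.
day = prices["Time"].dt.normalize()

# transform("last") returns a frame aligned with the original rows, holding
# each day's last non-missing value per column (assumes rows are sorted by Time).
closes = prices.groupby(day)[cols].transform("last").add_suffix("_close")

result = pd.concat([prices, closes], axis=1)
print(result)
```

Because transform keeps the original index, the new columns line up with the existing rows directly and no join key is needed.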