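I have a pandas DataFrame with timestamps as the index:
I would like to convert it into a DataFrame with one row per day, but without resampling the original DataFrame (I don't want to sum or average the hourly data). Ideally, I would like to get the 24 values for each day in a vector, for example:
Is there a quick way to do this?
Thanks!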
Answer 0: (score: 1)
IIUC, you can groupby on the date attribute of the index and then apply a lambda that collects each day's values into a list:
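In [21]:
import datetime as dt
import numpy as np
import pandas as pd
# generate some hourly sample data
df = pd.DataFrame({'GFS_rad': np.random.randn(100), 'GFS_tmp': np.random.randn(100)}, index=pd.date_range(dt.datetime(2016,1,1), freq='1h', periods=100))
# group by calendar date; the lambda receives one column at a time and returns that day's values as a list
df.groupby(df.index.date)[['GFS_rad', 'GFS_tmp']].agg(lambda x: x.tolist())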
Out[21]:
GFS_rad \
2016-01-01 [-0.324115177542, 1.59297335764, 0.58118555943...
2016-01-02 [-0.0547016526463, -1.10093451797, -1.55790161...
2016-01-03 [-0.34751220092, 1.06246918632, 0.181218794826...
2016-01-04 [0.950977469848, 0.422905080529, 1.98339145764...
2016-01-05 [-0.405124861624, 0.141470757613, -0.191169333...
GFS_tmp
2016-01-01 [-2.36889710412, -0.557972678049, -1.293544410...
2016-01-02 [-0.125562429825, -0.018852674365, -0.96735945...
2016-01-03 [0.802961514703, -1.68049099535, -0.5116769061...
2016-01-04 [1.35789157665, 1.37583167965, 0.538638510171,...
2016-01-05 [-0.297611872638, 1.10546853812, -0.8726761667...
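
If you would rather check the result on predictable numbers, here is a minimal sketch of the same idea. The sample data (48 hourly values over two full days) and the names daily and wide are made up for illustration; only the column names come from the question. It also shows one possible way to spread a day's list over 24 columns, which assumes every day has the same number of readings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 48 hourly readings covering two full days.
idx = pd.date_range("2016-01-01", periods=48, freq="h")
df = pd.DataFrame(
    {"GFS_rad": np.arange(48.0), "GFS_tmp": np.arange(48.0) * 2},
    index=idx,
)

# One row per calendar day; each cell holds that day's hourly values as a list.
daily = df.groupby(df.index.date).agg(list)

# Optional: expand one column's lists into one column per reading
# (assumes each day has the same number of readings, 24 here).
wide = pd.DataFrame(daily["GFS_rad"].tolist(), index=daily.index)

print(daily)
print(wide.shape)
```

Keeping the lists in cells is closest to what the question asks for; the wide layout is usually easier to work with afterwards if every day really has 24 readings.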