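I have a DataFrame, df_60, with a 60-minute time interval, and another, df_30, with 30-minute granularity. I want to copy the values from a column of df_60 into a column of df_30, keeping each value for the full duration it covers.
Say an hourly row such as 2011-01-05 00:00:00 has the value 1 in the val column. How do I "fill in" the 30-minute rows that fall within the period where the column in the 60-minute DataFrame equals x?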
>>>df_60
dt_hr_idx val #here val = 1 for times between 2am and 4am
2011-01-05 00:00:00 0
2011-01-05 01:00:00 0
2011-01-05 02:00:00 1
2011-01-05 03:00:00 1
2011-01-05 04:00:00 0
>>>df_30
dt_hlaf_hr_idx val #df_30 val column is currently blank
2011-01-05 00:00:00 0
2011-01-05 00:30:00 0
2011-01-05 01:00:00 0
2011-01-05 01:30:00 0
2011-01-05 02:00:00 0
2011-01-05 02:30:00 0
2011-01-05 03:00:00 0
2011-01-05 03:30:00 0
2011-01-05 04:00:00 0
#desired df
df_30
dt_hlaf_hr_idx val #val should be 1 for values between 2am and 4am
2011-01-05 00:00:00 0
2011-01-05 00:30:00 0
2011-01-05 01:00:00 0
2011-01-05 01:30:00 0
2011-01-05 02:00:00 1
2011-01-05 02:30:00 1
2011-01-05 03:00:00 1
2011-01-05 03:30:00 1
2011-01-05 04:00:00 0
I could hack something together with a loop, but is there a sane way to do this?
Thanks.
Answer 0 (score: 2)
Use DataFrame.reindex with method='ffill', so each half-hour timestamp takes the most recent hourly value:
df = df_60.reindex(df_30.index, method='ffill')
print (df)
val
2011-01-05 00:00:00 0
2011-01-05 00:30:00 0
2011-01-05 01:00:00 0
2011-01-05 01:30:00 0
2011-01-05 02:00:00 1
2011-01-05 02:30:00 1
2011-01-05 03:00:00 1
2011-01-05 03:30:00 1
2011-01-05 04:00:00 0
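For anyone who wants to run this end to end, here is a minimal sketch that rebuilds both frames from the question and writes the forward-filled values back into df_30. The construction with pd.date_range is my assumption about how the frames were built (the question only shows their printed form), and it assumes both indexes are sorted DatetimeIndex objects, which reindex with a fill method requires.

```python
import pandas as pd

# Hourly frame from the question: val is 1 at 02:00 and 03:00.
df_60 = pd.DataFrame(
    {"val": [0, 0, 1, 1, 0]},
    index=pd.date_range("2011-01-05 00:00", periods=5, freq="60min"),
)

# Half-hourly frame over the same span; val is a placeholder of zeros.
df_30 = pd.DataFrame(
    {"val": [0] * 9},
    index=pd.date_range("2011-01-05 00:00", periods=9, freq="30min"),
)

# Reindex the hourly column onto the half-hour timestamps. Timestamps that
# do not exist in df_60 (the :30 rows) take the previous hourly value.
df_30["val"] = df_60["val"].reindex(df_30.index, method="ffill")

print(df_30)
```

Because the assignment aligns on the index, only the val column of df_30 is overwritten and any other columns it might have stay untouched.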
Another solution, using merge_asof:
df = pd.merge_asof(df_30, df_60, left_index=True, right_index=True)
print (df)
val_x val_y
2011-01-05 00:00:00 0 0
2011-01-05 00:30:00 0 0
2011-01-05 01:00:00 0 0
2011-01-05 01:30:00 0 0
2011-01-05 02:00:00 0 1
2011-01-05 02:30:00 0 1
2011-01-05 03:00:00 0 1
2011-01-05 03:30:00 0 1
2011-01-05 04:00:00 0 0
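Note that the merged result keeps both columns: val_x is the placeholder from df_30 and val_y is the value carried over from df_60. A rough sketch of getting from there to the desired frame follows. The frame construction with pd.date_range and the suffix names _30 and _60 are my own choices for illustration, not something from the question.

```python
import pandas as pd

# Same sample frames as in the question (construction is assumed).
df_60 = pd.DataFrame(
    {"val": [0, 0, 1, 1, 0]},
    index=pd.date_range("2011-01-05 00:00", periods=5, freq="60min"),
)
df_30 = pd.DataFrame(
    {"val": [0] * 9},
    index=pd.date_range("2011-01-05 00:00", periods=9, freq="30min"),
)

# merge_asof matches each left row to the last right row whose key is
# less than or equal to it (direction="backward" is the default).
merged = pd.merge_asof(
    df_30,
    df_60,
    left_index=True,
    right_index=True,
    suffixes=("_30", "_60"),  # illustrative names instead of _x / _y
)

# Keep only the value that came from the hourly frame.
df_30["val"] = merged["val_60"]

print(df_30)
```

Both indexes need to be sorted for merge_asof to work, which is already the case for regular time series like these.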