I have a time series stored in a CSV file that I load into a DataFrame. It looks like this:
                        time  station_id station_name  value
0  2019-05-08 00:10:00+00:00     9018823     XXXXXXXX     11
1  2019-05-08 00:20:00+00:00     9018823     XXXXXXXX     10
2  2019-05-08 00:30:00+00:00     9018823     XXXXXXXX      9
3  2019-05-08 00:40:00+00:00     9018823     XXXXXXXX      9
4  2019-05-08 00:50:00+00:00     9018823     XXXXXXXX      9
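I am using pandas to fill the gaps within each day. For every day I only want to cover the range from 2019-05-08 00:00:00+00:00 to 2019-05-08 23:50:00+00:00. The following fills the gaps between readings, but I cannot fill the missing row at 00:00.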
data = data.set_index(keys=['time']).resample('10min').ffill()
Is this something I can do with pandas?
Update
Following the suggestion to use reindex, I get the whole time range, but every value in the resulting DataFrame is NaN.
date_str = data['time'].iloc[0].strftime('%Y-%m-%d')
time_range = pd.date_range(date_str, date_str + ' 23:59:00', freq='10T')
data = (data.set_index(keys=['time'])
            .resample('10min').ffill()
            .reindex(time_range).bfill())
                     station_id station_name  value
2019-05-08 00:00:00         NaN          NaN    NaN
2019-05-08 00:10:00         NaN          NaN    NaN
2019-05-08 00:20:00         NaN          NaN    NaN
2019-05-08 00:30:00         NaN          NaN    NaN
2019-05-08 00:40:00         NaN          NaN    NaN
2019-05-08 00:50:00         NaN          NaN    NaN
2019-05-08 01:00:00         NaN          NaN    NaN
2019-05-08 01:10:00         NaN          NaN    NaN
2019-05-08 01:20:00         NaN          NaN    NaN
2019-05-08 01:30:00         NaN          NaN    NaN
2019-05-08 01:40:00         NaN          NaN    NaN
2019-05-08 01:50:00         NaN          NaN    NaN
Answer 0 (score: 0)
Try reindex:
# day of data
date_str = data['time'].iloc[0].strftime('%Y-%m-%d')
# every 10-minute slot of that day
time_range = pd.date_range(date_str, date_str + ' 23:59:00', freq='10min')
# fill_method is no longer accepted by resample; call .ffill() on the resampler instead
data = (data.set_index(keys=['time'])
            .resample('10min').ffill()
            .reindex(time_range).bfill())
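A likely reason the asker's attempt returns only NaN is a time zone mismatch: the `time` column is UTC-aware (`+00:00`), while a range built from a plain date string is time zone naive, so none of the new labels match the existing index. This is an assumption based on the output shown in the question, not something confirmed in the thread. A minimal sketch, using the five rows from the question as sample data and the illustrative names `day_start` and `filled`, could look like this:

```python
import pandas as pd

# Sample rows copied from the question; utc=True keeps the index UTC-aware.
data = pd.DataFrame({
    "time": pd.to_datetime([
        "2019-05-08 00:10:00+00:00",
        "2019-05-08 00:20:00+00:00",
        "2019-05-08 00:30:00+00:00",
        "2019-05-08 00:40:00+00:00",
        "2019-05-08 00:50:00+00:00",
    ], utc=True),
    "station_id": [9018823] * 5,
    "station_name": ["XXXXXXXX"] * 5,
    "value": [11, 10, 9, 9, 9],
})

# Midnight of the first day, still UTC-aware so it matches the index.
day_start = data["time"].iloc[0].normalize()

# 144 ten-minute slots cover 00:00 through 23:50.
time_range = pd.date_range(day_start, periods=144, freq="10min")

# Forward-fill gaps after the first reading, then back-fill the 00:00 slot.
filled = data.set_index("time").reindex(time_range).ffill().bfill()
```

Because the 00:00 slot has no earlier reading, `ffill` alone leaves it empty; the trailing `bfill` copies the 00:10 row backwards. Note that `reindex` introduces NaN first, so integer columns come back as floats.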
Answer 1 (score: 0)
The interpolate function has several different fill methods and options; maybe give it a try?
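import numpy as np
import pandas as pd
date_range = pd.date_range(firstDate, lastDate, freq='10min')
df = df.reindex(date_range, fill_value=np.nan)
# axis=0 fills down the time index rather than across columns
df = df.interpolate(method='pad', limit_direction='forward', axis=0)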
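As a rough illustration of one of those methods, the sketch below interpolates only a numeric column by time. The three readings, the missing 00:30 slot, and the names `values`, `full_range` and `interpolated` are made up for the example; `limit_direction="both"` is used so that slots before the first reading and after the last one are filled with the nearest reading instead of being left as NaN.

```python
import pandas as pd

# Hypothetical readings with a missing 00:30 slot; the index is UTC-aware.
times = pd.to_datetime(
    ["2019-05-08 00:10", "2019-05-08 00:20", "2019-05-08 00:40"], utc=True
)
values = pd.Series([11.0, 10.0, 8.0], index=times, name="value")

# Target index covering 00:00 through 00:50 in the same time zone.
full_range = pd.date_range(
    "2019-05-08 00:00", "2019-05-08 00:50", freq="10min", tz="UTC"
)

# Reindex to expose the gaps, then interpolate linearly in time.
interpolated = values.reindex(full_range).interpolate(
    method="time", limit_direction="both"
)
```

This only makes sense for numeric data. Columns such as station_id or station_name are better carried along with `ffill` and `bfill`.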