I'm trying to resample time series data and interpolate between the samples. For some reason, I keep getting rows of NaN.
import pandas as pd
track = pd.DataFrame(
    {'latitude': [32.743680275092686, 32.75168725999735,
                  32.7782220355535, 32.783715199616,
                  32.78683419954979, 32.80517578125,
                  32.809039616988876, 32.82146906448623,
                  32.83804166114936, 32.839903750662074],
     'longitude': [-111.19133774115113, -111.205463020169,
                   -111.25232307278381, -111.26207624162947,
                   -111.26762545838648, -111.30013602120536,
                   -111.30697444993622, -111.32877894810267,
                   -111.35798240194515, -111.36134556361606],
     'altitude': [21125.0, 20975.0,
                  20175.0, 20000.0,
                  19900.0, 19350.0,
                  19225.0, 18850.0,
                  18325.0, 18250.0]},
    index=[pd.Timestamp('2016-10-20 20:57:24.189484'),
           pd.Timestamp('2016-10-20 20:57:32.102171'),
           pd.Timestamp('2016-10-20 20:57:58.872157'),
           pd.Timestamp('2016-10-20 20:58:04.522744'),
           pd.Timestamp('2016-10-20 20:58:07.806292'),
           pd.Timestamp('2016-10-20 20:58:26.574050'),
           pd.Timestamp('2016-10-20 20:58:30.495372'),
           pd.Timestamp('2016-10-20 20:58:43.221657'),
           pd.Timestamp('2016-10-20 20:59:00.496501'),
           pd.Timestamp('2016-10-20 20:59:02.362993')])
track.resample('S').interpolate('linear')
The result of resampling and interpolating is filled entirely with NaN:
latitude longitude altitude
2016-10-20 20:57:24 NaN NaN NaN
2016-10-20 20:57:25 NaN NaN NaN
2016-10-20 20:57:26 NaN NaN NaN
2016-10-20 20:57:27 NaN NaN NaN
2016-10-20 20:57:28 NaN NaN NaN
...
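A quick check that might explain the NaNs, sketched on three of the points above. It assumes that upsampling first places the data on the whole-second grid and only then interpolates, which is my understanding of how this behaves in the pandas version I'm using; other releases may differ. The names `idx`, `grid`, `overlap`, and `on_grid` are just for illustration.

```python
import pandas as pd

# Three of the original timestamps (microsecond resolution).
idx = pd.to_datetime([
    '2016-10-20 20:57:24.189484',
    '2016-10-20 20:57:32.102171',
    '2016-10-20 20:57:58.872157',
])

# The whole-second grid that a 1-second resample would produce.
grid = pd.date_range(idx[0].floor('s'), idx[-1].floor('s'), freq='s')

# Count how many original timestamps fall exactly on a grid point.
overlap = idx.intersection(grid)
print(len(overlap))

# Reindexing onto the grid keeps only exact matches, so if there are
# none, every grid row is NaN and there is nothing to interpolate from.
alt = pd.Series([21125.0, 20975.0, 20175.0], index=idx)
on_grid = alt.reindex(grid)
print(on_grid.notna().sum())
```

If both counts come out as zero, the original samples are being dropped before interpolation ever sees them.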
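I suspect the complication comes from taking points with sub-second resolution and trying to interpolate between them on a 1-second grid. I can round the index to the nearest second, which gets me closer to the result I expect: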
track.index = track.index.round('1s')
track.resample('S').interpolate('linear')
Result:
latitude longitude altitude
2016-10-20 20:57:24 32.743680 -111.191338 21125.000000
2016-10-20 20:57:25 32.744681 -111.193103 21106.250000
2016-10-20 20:57:26 32.745682 -111.194869 21087.500000
2016-10-20 20:57:27 32.746683 -111.196635 21068.750000
2016-10-20 20:57:28 32.747684 -111.198400 21050.000000
...
Rounding introduces some error, although it is small in this case. Is there a better way?
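One alternative that might avoid the rounding, as a minimal sketch rather than a definitive answer: add the whole-second grid points to the original index, interpolate with `method='time'` so the weights follow the actual elapsed time, and then keep only the grid points. It is shown on three altitude samples from above; the names `alt`, `grid`, and `resampled` are hypothetical, and restricting the grid to the observed time range is my own assumption to avoid extrapolating.

```python
import pandas as pd

# Small illustrative subset of the track above (altitude only).
alt = pd.Series(
    [21125.0, 20975.0, 20175.0],
    index=pd.to_datetime([
        '2016-10-20 20:57:24.189484',
        '2016-10-20 20:57:32.102171',
        '2016-10-20 20:57:58.872157',
    ]),
)

# Whole-second grid lying inside the observed time range.
grid = pd.date_range(alt.index[0].ceil('s'), alt.index[-1].floor('s'), freq='s')

# Merge the grid into the original index, interpolate linearly in time
# between the original samples, then keep only the grid points.
resampled = (
    alt.reindex(alt.index.union(grid))
       .interpolate(method='time')
       .reindex(grid)
)
print(resampled.head())
```

The same chain should apply to the full `track` DataFrame, since `reindex` and `interpolate` work column by column. The original timestamps are left untouched, so no rounding error is introduced.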