When I resample some data, pandas drops the first row. See the example below. Note that if you move the last timestamp forward by 1 second, it works as expected.
I am using pandas 0.10.1.
import pandas as pd
from datetime import datetime
from StringIO import StringIO
f = StringIO('''\
time,value
2011-06-03 00:00:05,0
2011-06-03 00:01:05,1
2011-06-03 00:02:05,2
''')
series = pd.read_csv(f, parse_dates=True, index_col=0)['value']
print series
# time
# 2011-06-03 00:00:05 0
# 2011-06-03 00:01:05 1
# 2011-06-03 00:02:05 2
# Name: value
# Problem resampling: 1st sample is missing
print series.resample('s')
# time
# 2011-06-03 00:00:06 NaN
# 2011-06-03 00:00:07 NaN
# 2011-06-03 00:00:08 NaN
# 2011-06-03 00:00:09 NaN
# ...
# 2011-06-03 00:01:52 NaN
# 2011-06-03 00:02:03 NaN
# 2011-06-03 00:02:04 NaN
# 2011-06-03 00:02:05 2
# 2011-06-03 00:02:06 NaN
# Freq: S, Name: value, Length: 121
Answer 0 (score: 0)
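The default value of the closed parameter changed in 0.11; see here. I don't know whether there is still a bug there. You can try specifying the closed side of the interval explicitly.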
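A minimal sketch of passing the closed side explicitly, using the data from the question. It assumes a recent pandas on Python 3, where `resample()` returns a resampler object and an aggregation such as `.mean()` has to be called on it; in the 0.10 to 0.12 releases discussed in this thread, `resample()` returned the aggregated result directly. The names `csv_text` and `resampled` are only for illustration.

```python
from io import StringIO

import pandas as pd

# Same three samples as in the question.
csv_text = """\
time,value
2011-06-03 00:00:05,0
2011-06-03 00:01:05,1
2011-06-03 00:02:05,2
"""
series = pd.read_csv(StringIO(csv_text), parse_dates=True, index_col=0)["value"]

# State the bin edges explicitly instead of relying on the version-dependent
# default: each 1-second bin is [t, t + 1s) and is labeled with its left edge t.
# Assumption: recent pandas, so an explicit aggregation (.mean()) is required.
resampled = series.resample("s", closed="left", label="left").mean()

print(resampled.head())
print(len(resampled), "rows,", int(resampled.notna().sum()), "non-null")
```

With the edges spelled out like this, the first bin starts at the first timestamp, so the sample at 00:00:05 should appear in the first row of the result.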
The current pandas version is 0.12 (0.13 is coming soon). Your best bet is to upgrade.
As of 0.12 this looks fine. The default is closed='left'.
In [11]: df
Out[11]:
value
time
2011-06-03 00:00:05 0
2011-06-03 00:01:05 1
2011-06-03 00:02:05 2
In [12]: df.index
Out[12]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-06-03 00:00:05, ..., 2011-06-03 00:02:05]
Length: 3, Freq: None, Timezone: None
In [13]: df.resample('1s')
Out[13]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 121 entries, 2011-06-03 00:00:05 to 2011-06-03 00:02:05
Freq: S
Data columns (total 1 columns):
value 3 non-null values
dtypes: float64(1)
In [14]: df.resample('1s').head()
Out[14]:
value
time
2011-06-03 00:00:05 0
2011-06-03 00:00:06 NaN
2011-06-03 00:00:07 NaN
2011-06-03 00:00:08 NaN
2011-06-03 00:00:09 NaN
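As an aside, if the goal is only to put the existing samples on a regular 1-second grid without aggregating anything, `asfreq` may be a simpler option than `resample`, because it reindexes rather than bins and so the closed-interval default does not come into play. This is a different technique from the one in the answer above, sketched here under the assumption of a recent pandas on Python 3; `csv_text` and `on_grid` are illustrative names.

```python
from io import StringIO

import pandas as pd

# Same three samples as in the question.
csv_text = """\
time,value
2011-06-03 00:00:05,0
2011-06-03 00:01:05,1
2011-06-03 00:02:05,2
"""
series = pd.read_csv(StringIO(csv_text), parse_dates=True, index_col=0)["value"]

# asfreq builds a 1-second index from the first to the last timestamp and
# reindexes onto it; grid points without an original sample become NaN.
on_grid = series.asfreq("s")

# Sanity check: every original timestamp is still present after the conversion.
print(series.index.isin(on_grid.index).all())
print(on_grid.head())
```

This only fits when the original timestamps already fall on the target grid, as they do here; samples that sit between grid points would be lost by `asfreq`, and in that case an aggregating `resample` is the safer choice.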