Error when slicing a time series with pandas

Asked: 2017-08-03 08:40:33

Tags: python pandas numpy time-series

I am trying to slice a time series. Slicing with string bounds works perfectly:

subseries = series['2015-07-07 01:00:00':'2015-07-07 03:30:00']

But the following code does not work:

import datetime
import pandas as pd

def GetDatetime():
    # Python 2 code: raw_input() reads a line from standard input
    Y = int(raw_input("Year "))
    M = int(raw_input("Month "))
    D = int(raw_input("Day "))
    d = datetime.datetime(Y, M, D)  # creates a datetime object
    return d

filePath = "pathtofile.csv"
series = pd.read_csv(str(filePath), index_col='date') 
series.index = pd.to_datetime(series.index, unit='s')

d = GetDatetime()
f = GetDatetime()

subseries = series[d:f]

The last line produces this error:

Traceback (most recent call last):
  File "dontgivemeerrorsbrasommek.py", line 37, in <module>
    brasla7nina= df[d:f]
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1952, in __getitem__
    indexer = convert_to_index_sliceable(self, key)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexing.py", line 1896, in convert_to_index_sliceable
    return idx._convert_slice_indexer(key, kind='getitem')
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexes/base.py", line 1407, in _convert_slice_indexer
    indexer = self.slice_indexer(start, stop, step, kind=kind)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexes/datetimes.py", line 1515, in slice_indexer
    return Index.slice_indexer(self, start, end, step, kind=kind)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexes/base.py", line 3350, in slice_indexer
    kind=kind)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexes/base.py", line 3538, in slice_locs
    start_slice = self.get_slice_bound(start, 'left', kind)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.20.2-py2.7-linux-x86_64.egg/pandas/core/indexes/base.py", line 3487, in get_slice_bound
    raise err
KeyError: 1435802520000000000

I thought this was a timestamp conversion problem, so I tried the following, but it still does not work:

d3 = pd.Timestamp(datetime.datetime(Y, M, D, H, m))
d2 = pd.to_datetime(d)

Any help would be greatly appreciated, thanks. :)

2 Answers:

Answer 0 (score: 4)

Change the return statement of the GetDatetime() function to:

return str(d)

This returns the datetime as a string, and the series is able to handle such strings when slicing.
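As a quick check of why this helps, here is a minimal sketch of what str() yields for a datetime object. The date below is an example value standing in for the user's input, not something taken from the question's data.

```python
import datetime

# Example date standing in for the values typed by the user (an assumption).
d = datetime.datetime(2015, 7, 7)

# str() on a datetime gives "YYYY-MM-DD HH:MM:SS", the same shape as the
# string bounds that already worked in the question.
as_text = str(d)
print(as_text)  # 2015-07-07 00:00:00
```

Note that if the datetime carried a non-zero microsecond part, str() would append it as a fractional second.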

Answer 1 (score: 0)

If I understand your code correctly, when you do this:

subseries = series['2015-07-07 01:00:00':'2015-07-07 03:30:00']

you are slicing series with two strings (by the way, the name series is confusing, because pandas has a data type called Series).

If that works, then all subseries = df[d:f] needs is for d and f to be strings.

You can do that by calling the datetime method .strftime(), for example:

d = GetDatetime().strftime('%Y-%m-%d 00:00:00')
f = GetDatetime().strftime('%Y-%m-%d 00:00:00')
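To see the whole idea end to end, here is a small sketch on made-up data. The sample series, the helper name format_bound, and the use of .loc are my own assumptions for illustration, not part of the question's code. It also formats the time of day with %H:%M:%S instead of hardcoding midnight, so bounds such as 01:00 and 03:30 survive the conversion.

```python
import datetime

import pandas as pd

# Hypothetical sample data: one value every 30 minutes, sorted by time.
index = pd.date_range("2015-07-07 00:00:00", periods=10, freq="30min")
series = pd.Series(range(10), index=index)


def format_bound(dt):
    """Format a datetime like the string bounds used in the question."""
    return dt.strftime("%Y-%m-%d %H:%M:%S")


start = format_bound(datetime.datetime(2015, 7, 7, 1, 0))
end = format_bound(datetime.datetime(2015, 7, 7, 3, 30))

# Label-based slicing with strings includes both end points.
subseries = series.loc[start:end]
print(subseries)
```

This sketch assumes the index is sorted. On an unsorted index, label slicing with bounds that are not present in the index may raise a KeyError depending on the pandas version, so calling sort_index() first is probably worth trying.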