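I have a Pandas DataFrame whose index is a series of consecutive dates. I am trying to iterate over the dates, but a KeyError is raised even though I know the given key exists and has the right type (a Pandas Timestamp).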
import pandas as pd
import datetime
## Importing the data from the Sep 2016-August 2018
## Step count & Date features only
features = ['Date','Step count']
data = pd.read_csv('fit_daily_sum_Sep2016_Aug2018.csv', sep=',', usecols=features).set_index('Date')
# To convert data index to datetime
data.index = pd.to_datetime(data.index)
tmp = data.head()
print tmp.index
print 'first key',tmp.index[0]
print type(tmp.index[0])
fkey = pd.Timestamp(2016,9,2)
print 'fkey is',fkey
for x in xrange(0, len(tmp)):
    print 'running',fkey+datetime.timedelta(days=x)
    print tmp[fkey+datetime.timedelta(days=x)]
The last line raises a KeyError on its first iteration. The console shows the following (abridged):
DatetimeIndex(['2016-09-02', '2016-09-03', '2016-09-04', '2016-09-05',
'2016-09-06'],
dtype='datetime64[ns]', name=u'Date', freq=None)
first key 2016-09-02 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
fkey is 2016-09-02 00:00:00
running 2016-09-02 00:00:00
KeyError: Timestamp('2016-09-02 00:00:00')
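As far as I can tell, I am supplying an exact key that really does exist, yet a KeyError is raised! I am not sure where the problem lies. Any help would be appreciated.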
Answer 0 (score: 1)
tmp[fkey+datetime.timedelta(days=x)]
This expression looks the key up among the DataFrame's column labels, not in its index.
Try
tmp.loc[fkey+datetime.timedelta(days=x)]
or
tmp['Step count'][fkey+datetime.timedelta(days=x)]
# where 'Step count' is the column name of interest.
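To make the difference concrete, here is a minimal sketch in Python 3 syntax (the question's code is Python 2). It assumes a small hand-built DataFrame in place of the CSV file; the step values are made up for illustration, and names such as `column_lookup_failed`, `rows` and `steps` are hypothetical.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for the CSV data: five consecutive days, made-up counts.
index = pd.date_range("2016-09-02", periods=5, name="Date")
tmp = pd.DataFrame({"Step count": [100, 200, 300, 400, 500]}, index=index)

fkey = pd.Timestamp(2016, 9, 2)

# tmp[key] searches the column labels, so a Timestamp key raises KeyError
# even though that Timestamp is present in the row index.
try:
    tmp[fkey]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# tmp.loc[key] searches the row index and returns the matching row as a Series.
rows = [tmp.loc[fkey + datetime.timedelta(days=x)] for x in range(len(tmp))]

# Selecting the column first gives a Series, and [] on a Series looks up its index.
steps = [tmp["Step count"][fkey + datetime.timedelta(days=x)] for x in range(len(tmp))]
```

If the goal is simply to visit every date, looping over `tmp.index` directly and passing each value to `tmp.loc[...]` avoids the date arithmetic altogether.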