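I am using a recurrent neural network to forecast hourly wind speed from previous wind data. To build the training data, I am trying to use .shift to move the data back by 1 hour. My DataFrame looks like [this][1]. My code is: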
from datetime import datetime
import numpy as np
import pandas as pd
from pandas import DataFrame
wind_p = [0, 0.03454225, 0.02062136, 0.00186715, 0.01517354, 0.0129046,
0.02231125, 0.01492537, 0.09646542, 0.28444476]
Speed = [0, 2.25226244, 1.44078451, 0.99174488, 0.71179491, 0.92824542, 1.67776948, 2.96399534, 5.06257161, 7.06504245]
Date = ['2012-01-01 01:00:00' ,'2012-01-01 02:00:00', '2012-01-01 03:00:00', '2012-01-01 04:00:00',
'2012-01-01 05:00:00', '2012-01-01 06:00:00', '2012-01-01 07:00:00',
'2012-01-01 08:00:00', '2012-01-01 09:00:00', '2012-01-01 10:00:00']
df = pd.DataFrame({'date':Date,'wind_P':wind_p,'Speed':Speed})
dates=[datetime.strptime(x,'%Y-%m-%d %H:%M:%S') for x in Date]
df['t']= [x for x in range(10)]
df['t+1'] = df[Speed].shift(-1)
print(df)
The error message I get from this is:
KeyError Traceback (most recent call last)
<ipython-input-1-bb88ddb20ff4> in <module>()
18
19 df['t']= [x for x in range(10)]
---> 20 df['t+1'] = df[Speed].shift(-1)
21 print(df)
~/anaconda3_501/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
1956 if isinstance(key, (Series, np.ndarray, Index, list)):
1957 # either boolean or fancy integer index
-> 1958 return self._getitem_array(key)
1959 elif isinstance(key, DataFrame):
1960 return self._getitem_frame(key)
~/anaconda3_501/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_array(self, key)
2000 return self.take(indexer, axis=0, convert=False)
2001 else:
-> 2002 indexer = self.loc._convert_to_indexer(key, axis=1)
2003 return self.take(indexer, axis=1, convert=True)
2004
~/anaconda3_501/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
1229 mask = check == -1
1230 if mask.any():
-> 1231 raise KeyError('%s not in index' % objarr[mask])
1232
1233 return _values_from_object(indexer)
KeyError: '[0. 2.25226244 1.44078451 0.99174488 0.71179491 0.92824542\n 1.67776948 2.96399534 5.06257161 7.06504245] not in index'
Help with shifting the data in column `Speed` back by 1 step would be appreciated!
[1]: https://i.stack.imgur.com/oANWb.jpg
Answer 0 (score: 1)
KeyError: '[0. 2.25226244 1.44078451 0.99174488 0.71179491 0.92824542\n 1.67776948 2.96399534 5.06257161 7.06504245] not in index'
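The error shows that Speed (the list) is being used to index the df object. pandas treats each value in the list as a column label, and none of those floats is a column of df, so it raises a KeyError.
It looks like you meant to use the string 'Speed' as the column label: change df[Speed] to df['Speed'].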
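A minimal sketch of the difference, using a shortened list with rounded values (the lowercase name speed and the three sample values are only for illustration, not the asker's full data):

```python
import pandas as pd

# Shortened, rounded sample values (illustrative only).
speed = [0.0, 2.25, 1.44]
df = pd.DataFrame({'Speed': speed})

# Indexing with the list: pandas looks for columns labelled 0.0, 2.25, 1.44.
try:
    df[speed]
    raised = False
except KeyError as exc:
    raised = True
    print('KeyError:', exc)

# Indexing with the string label selects the column, which can then be shifted.
shifted = df['Speed'].shift(-1)
print(shifted)
```

The exact wording of the KeyError message depends on the pandas version, but the cause is the same: the list's values are looked up as column labels.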
Answer 1 (score: 1)
In [114]: df['t+1'] = df['Speed'].shift(-1)
# NOTE:                  ^     ^
In [115]: df
Out[115]:
                  date    wind_P     Speed  t       t+1
0  2012-01-01 01:00:00  0.000000  0.000000  0  2.252262
1  2012-01-01 02:00:00  0.034542  2.252262  1  1.440785
2  2012-01-01 03:00:00  0.020621  1.440785  2  0.991745
3  2012-01-01 04:00:00  0.001867  0.991745  3  0.711795
4  2012-01-01 05:00:00  0.015174  0.711795  4  0.928245
5  2012-01-01 06:00:00  0.012905  0.928245  5  1.677769
6  2012-01-01 07:00:00  0.022311  1.677769  6  2.963995
7  2012-01-01 08:00:00  0.014925  2.963995  7  5.062572
8  2012-01-01 09:00:00  0.096465  5.062572  8  7.065042
9  2012-01-01 10:00:00  0.284445  7.065042  9       NaN
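Note that the last row has NaN in t+1 because there is no following hour to shift in. If the shifted column is meant to be the prediction target, one possible way to handle this (an assumption about the intended use, not something stated in the question) is to drop that row before building the input/target arrays. The names pairs, X and y below are hypothetical:

```python
import pandas as pd

# First four Speed values from the question.
speed = [0.0, 2.25226244, 1.44078451, 0.99174488]
df = pd.DataFrame({'Speed': speed})

# Next-hour value for each row; the last row gets NaN.
df['t+1'] = df['Speed'].shift(-1)

# Keep only rows that have a next-hour target.
pairs = df.dropna(subset=['t+1'])

X = pairs['Speed'].to_numpy()  # speed at hour t
y = pairs['t+1'].to_numpy()    # speed at hour t+1
print(pairs)
```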