OK, starting from the top, these are the imports I use:
import keras
from keras import layers
from keras.models import Sequential
import pandas as pd
from sklearn.model_selection import train_test_split
Then I use pandas to read the data from a CSV, separate the necessary fields into X and y, and split them into training and test sets.
df = pd.read_csv('data/BCHAIN-NEW.csv')
y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)
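No shuffling is applied, so the data is split in its original order: the first 80% of the rows go to training and the last 20% to testing.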
X_test.head()
>>>
      Value USD  Drop 7  Up 7  Mean Change 7   Change
2320    1023.14     5.0   2.0     -22.754286  -103.62
2321    1126.76     5.0   2.0      -4.470000   132.09
2322     994.67     5.0   2.0       9.865714   111.58
2323     883.09     5.0   2.0       9.005714   -13.74
2324     896.83     5.0   2.0      12.797143   -11.31
X_train.head()
>>>
   Value USD  Drop 7  Up 7  Mean Change 7    Change
0    0.06480     2.0   4.0      -0.000429  -0.00420
1    0.06900     1.0   5.0       0.000274   0.00403
2    0.06497     1.0   5.0       0.000229   0.00007
3    0.06490     1.0   5.0       0.000514   0.00200
4    0.06290     2.0   4.0       0.000229  -0.00050
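The two head() outputs above show the training set starting at index 0 and the test set starting at index 2320. As a minimal sketch of that behaviour, the toy frame below (10 made-up rows, with a hypothetical `feature` column standing in for the real CSV) is split with the same `test_size=0.20, shuffle=False` arguments:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the real CSV: 10 rows with the default 0..9 index.
# The "feature" column is hypothetical; "Predict" mirrors the question's target name.
df = pd.DataFrame({"feature": range(10), "Predict": [0, 1] * 5})
X = df[["feature"]]
y = df["Predict"]

# Same arguments as in the question: 20% test split, no shuffling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, shuffle=False
)

print(list(X_train.index))  # the first 8 rows, in their original order
print(list(X_test.index))   # the last 2 rows
```

Both splits keep their original pandas index labels, which is why X_test.head() in the question begins at 2320 rather than 0.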
Now running the model like this raises an index error (a KeyError saying the values are "not in index"):
model = Sequential()
model.add(layers.Dense(100, activation='relu', input_shape=(5,)))
model.add(layers.Dense(100, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3)
>>>
Epoch 1/3
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-38-868bc86350df> in <module>()
4 model.add(layers.Dense(5, activation='softmax'))
5 model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
----> 6 model.fit(X_train, y_train, epochs=3)
c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
...
c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\pandas\core\indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
1267 if mask.any():
1268 raise KeyError('{mask} not in index'
-> 1269 .format(mask=objarr[mask]))
1270
1271 return _values_from_object(indexer)
KeyError: '[1330 480 101 2009 1131 379 1498 2188 2121 700 1877 2011 2244 1262\n 1493 956 150 479 1345 1073 1173 1909 2260 2288 355 670 2143 1426\n 42 952 358 1183] not in index'
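Answer 0: (score: 1)
It looks to me like your data is in the wrong format; you need to pass numpy arrays. (This assumes they are not already numpy arrays; in the question, X_train and y_train are pandas objects.)
Try converting them like this: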
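import numpy as np
# Use the same names as in the question (capital X), since Python is case-sensitive.
X_train = np.array(X_train)
y_train = np.array(y_train)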
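One plausible reading of the traceback, assuming this Keras version builds each batch by indexing the input with a list of shuffled row positions: on a pandas DataFrame, `frame[list_of_ints]` looks for columns with those labels, not rows, so it fails with a KeyError. A numpy array indexed the same way returns rows. The sketch below uses a small made-up frame and a hypothetical `batch_rows` list to show the difference:

```python
import numpy as np
import pandas as pd

# Toy frame with string column labels, like the one in the question.
X_train = pd.DataFrame({
    "Value USD": [0.06480, 0.06900, 0.06497],
    "Change": [-0.00420, 0.00403, 0.00007],
})

# Hypothetical shuffled row positions for one batch.
batch_rows = [2, 0]

# On a DataFrame, [] with a list selects COLUMNS by label, so this fails.
try:
    X_train[batch_rows]
    raised = False
except KeyError:
    raised = True
print("DataFrame indexing raised KeyError:", raised)

# On a numpy array, the same list selects ROWS by position.
X_array = np.array(X_train)
batch = X_array[batch_rows]
print(batch)
```

If that is indeed the cause, converting both X_train and y_train before calling model.fit should avoid the error, since the model then only ever sees positional arrays.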