我有一个时间序列不平衡的数据集,必须在其上执行二进制分类。我不能随机分配训练和测试集,甚至不能对其进行分层。问题在于,虽然训练模型验证的准确性始终为1。我意识到这与训练测试拆分有关,但我可能是错的。
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=None, shuffle=False)
from collections import Counter
print(Counter(y))
print(Counter(y_train))
print(Counter(y_test))
Counter({0.0: 55534, 1.0: 10000})
Counter({0.0: 9995, 1.0: 3111})
Counter({0.0: 45539, 1.0: 6889})
model = Sequential()
#First Hidden Layer
model.add(Dense(128, activation='relu', kernel_initializer='random_normal', input_dim=19))
#model.add(Dropout(0.3))
#Second Hidden Layer
model.add(Dense(64, activation='relu', kernel_initializer='random_normal'))
#Output Layer
model.add(Dense(1, activation='sigmoid', kernel_initializer='random_normal'))
history = model.fit(X_train,y_train, batch_size=128, validation_split=0.1, epochs=50)
Train on 11795 samples, validate on 1311 samples
Epoch 1/50
11795/11795 [==============================] - 0s 34us/step - loss: 1.1359 - accuracy: 0.8719 - val_loss: 4.2016e-18 - val_accuracy: 1.0000
Epoch 2/50
11795/11795 [==============================] - 0s 12us/step - loss: 0.1247 - accuracy: 0.9442 - val_loss: 1.0255e-19 - val_accuracy: 1.0000
Epoch 3/50
11795/11795 [==============================] - 0s 13us/step - loss: 0.1177 - accuracy: 0.9462 - val_loss: 3.2516e-23 - val_accuracy: 1.0000
Epoch 4/50
11795/11795 [==============================] - 0s 12us/step - loss: 0.1103 - accuracy: 0.9519 - val_loss: 1.1607e-23 - val_accuracy: 1.0000
Epoch 5/50
11795/11795 [==============================] - 0s 13us/step - loss: 0.0805 - accuracy: 0.9739 - val_loss: 6.2345e-26 - val_accuracy: 1.0000
感谢有关此问题的任何帮助。谢谢