在新数据集中进行预测

时间:2019-07-29 09:52:05

标签: python machine-learning keras deep-learning prediction

我建立了一个keras逻辑回归模型。我正在尝试找到一种方法,可以为我的模型提供新的数据集,并在通过的新数据集中提供预测。我的新数据集将与模型的形状相同

我的第二个问题是有一种方法可以提高模型的准确性,因为我的准确率是69%,当我打印分类报告时,我在一个班级中的准确度很差

X=new.drop('reassed',axis=1)
y=new['reassed'].astype(int)

分割数据

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)


from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Initialising the ANN
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 27, kernel_initializer = 'uniform', activation = 'relu', input_dim = 6))

# Adding the second hidden layer
classifier.add(Dense(units = 27, kernel_initializer = 'uniform', activation = 'relu'))

# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])`enter code here`

# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)


Epoch 1/20
16704/16704 [==============================] - 1s 76us/step - loss: 0.6159 - acc: 0.6959
Epoch 2/20
16704/16704 [==============================] - 1s 65us/step - loss: 0.6114 - acc: 0.6967
Epoch 3/20
16704/16704 [==============================] - 1s 65us/step - loss: 0.6110 - acc: 0.6964
Epoch 4/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6101 - acc: 0.6965
Epoch 5/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6091 - acc: 0.6961
Epoch 6/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6094 - acc: 0.6963
Epoch 7/20
16704/16704 [==============================] - 1s 68us/step - loss: 0.6086 - acc: 0.6967
Epoch 8/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6083 - acc: 0.6965
Epoch 9/20
16704/16704 [==============================] - 1s 65us/step - loss: 0.6081 - acc: 0.6964: 0s - loss: 0.6085 - acc: 
Epoch 10/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6082 - acc: 0.6971
Epoch 11/20
16704/16704 [==============================] - 1s 67us/step - loss: 0.6077 - acc: 0.6968
Epoch 12/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6073 - acc: 0.6971
Epoch 13/20
16704/16704 [==============================] - 1s 65us/step - loss: 0.6067 - acc: 0.6971
Epoch 14/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6070 - acc: 0.6965
Epoch 15/20
16704/16704 [==============================] - 1s 65us/step - loss: 0.6066 - acc: 0.6967: 0s - loss: 0.6053 - ac
Epoch 16/20
16704/16704 [==============================] - 1s 66us/step - loss: 0.6060 - acc: 0.6967
Epoch 17/20
16704/16704 [==============================] - 1s 67us/step - loss: 0.6061 - acc: 0.6968
Epoch 18/20
16704/16704 [==============================] - 1s 67us/step - loss: 0.6062 - acc: 0.6971
Epoch 19/20
16704/16704 [==============================] - 1s 69us/step - loss: 0.6057 - acc: 0.6968
Epoch 20/20
16704/16704 [==============================] - 1s 74us/step - loss: 0.6055 - acc: 0.6973

y_pred = classifier.predict(X_test)
y_pred = [ 1 if y>=0.5 else 0 for y in y_pred ]

print(classification_report(y_test, y_pred))

      precision    recall  f1-score   support

           0       0.71      1.00      0.83      2968
           1       0.33      0.00      0.01      1208

   micro avg       0.71      0.71      0.71      4176
   macro avg       0.52      0.50      0.42      4176
weighted avg       0.60      0.71      0.59      4176

我希望改善自己的模型

我希望找到一种可以在新数据集中进行预测的方法

1 个答案:

答案 0 :(得分:0)

要对新数据集进行预测

  1. 加载数据与加载测试集相同
  2. 应用在您的训练集上应用的所有每个处理步骤。
  3. 使用

    model.predict(X)

功能可以进行预测并进行后续处理。

几乎与测试集的预测相同。