I am trying to apply a deep learning network to a loan-status dataset, to check whether it can do better than traditional machine learning algorithms.
The accuracy seems very low (even lower than with plain logistic regression). How can I improve it?
Things I have tried:
- changing the learning rate
- adding more layers
- increasing/decreasing the number of nodes
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers

# Features: all dummy-encoded columns; target: Loan_Status
X = df_dummies.drop('Loan_Status', axis=1).values
y = df_dummies['Loan_Status'].values

# Fully connected network: 17 inputs -> four hidden layers -> sigmoid output
model = Sequential()
model.add(Dense(50, input_dim=17, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

sgd = optimizers.SGD(lr=0.00001)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=50, shuffle=True, verbose=2)
model.summary()
Epoch 1/50 - 1s - loss: 4.9835 - acc: 0.6873
Epoch 2/50 - 0s - loss: 4.9830 - acc: 0.6873
Epoch 3/50 - 0s - loss: 4.9821 - acc: 0.6873
Epoch 4/50 - 0s - loss: 4.9815 - acc: 0.6873
Epoch 5/50 - 0s - loss: 4.9807 - acc: 0.6873
Epoch 6/50 - 0s - loss: 4.9800 - acc: 0.6873
Epoch 7/50 - 0s - loss: 4.9713 - acc: 0.6873
Epoch 8/50 - 0s - loss: 8.5354 - acc: 0.4397
Epoch 9/50 - 0s - loss: 4.8322 - acc: 0.6743
Epoch 10/50 - 0s - loss: 4.9852 - acc: 0.6873
(epochs 11–50 are identical: - 0s - loss: 4.9852 - acc: 0.6873)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_19 (Dense)             (None, 50)                900
_________________________________________________________________
dense_20 (Dense)             (None, 100)               5100
_________________________________________________________________
dense_21 (Dense)             (None, 100)               10100
_________________________________________________________________
dense_22 (Dense)             (None, 100)               10100
_________________________________________________________________
dense_23 (Dense)             (None, 100)               10100
_________________________________________________________________
dense_24 (Dense)             (None, 1)                 101
=================================================================
Total params: 36,401
Trainable params: 36,401
Non-trainable params: 0
_________________________________________________________________
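The loss barely moves while the accuracy stays pinned at 0.6873, which matches always predicting the majority class, so the network is likely saturated. Unscaled inputs (dummy columns mixed with raw loan amounts and incomes) combined with lr = 0.00001 would explain this. Below is a minimal sketch of scaling the features first and training with a larger learning rate; StandardScaler, Adam, and lr = 0.001 are assumptions for illustration, not code from the question:

# A minimal sketch, not the original code: standardize the inputs and use Adam.
# Assumes df_dummies / X / y are defined as above.
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers

# Standardize the inputs: dummy columns and raw amounts/incomes have very
# different scales, which can keep the ReLU/sigmoid units saturated.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

model = Sequential()
model.add(Dense(50, input_dim=17, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Adam with a default-magnitude learning rate instead of SGD(lr=0.00001)
model.compile(optimizer=optimizers.Adam(lr=0.001),
              loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_scaled, y, epochs=50, shuffle=True, verbose=2)

With standardized inputs the loss usually starts moving immediately; if a train/test split is added later, the scaler must be fit on the training portion only.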
Answer (score: 0)
I was able to improve it slightly by making the network deeper and adding dropout, but I still think it can be improved further, because plain logistic regression gives better accuracy (80%+).
Does anyone know a way to improve it further?
from keras.layers import Dropout

# Deeper network with dropout after each hidden layer
model = Sequential()
model.add(Dense(1000, input_dim=17, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1000, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1000, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1000, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1000, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1000, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

sgd = optimizers.SGD(lr=0.0001)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
# X_train / y_train come from a train/test split (the split itself is not shown here)
model.fit(X_train, y_train, epochs=20, shuffle=True, verbose=2, batch_size=30)
Epoch 1/20
- 2s - loss: 4.8965 - acc: 0.6807
Epoch 2/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 3/20
- 1s - loss: 4.6091 - acc: 0.7040
Epoch 4/20
- 1s - loss: 4.5642 - acc: 0.7040
Epoch 5/20
- 1s - loss: 4.6937 - acc: 0.7040
Epoch 6/20
- 1s - loss: 4.6830 - acc: 0.7063
Epoch 7/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 8/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 9/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 10/20
- 1s - loss: 4.6452 - acc: 0.7086
Epoch 11/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 12/20
- 1s - loss: 4.6824 - acc: 0.7063
Epoch 13/20
- 1s - loss: 4.7200 - acc: 0.7040
Epoch 14/20
- 1s - loss: 4.6608 - acc: 0.7063
Epoch 15/20
- 1s - loss: 4.6940 - acc: 0.7040
Epoch 16/20
- 1s - loss: 4.7136 - acc: 0.7040
Epoch 17/20
- 1s - loss: 4.6056 - acc: 0.7063
Epoch 18/20
- 1s - loss: 4.5640 - acc: 0.7016
Epoch 19/20
- 1s - loss: 4.7009 - acc: 0.7040
Epoch 20/20
- 1s - loss: 4.6892 - acc: 0.7040
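Since the benchmark here is plain logistic regression (80%+), it helps to score both models on the same held-out data. Below is a minimal baseline sketch; the split parameters and the scaling pipeline are assumptions, since the original train/test split is not shown:

# A minimal sketch of the logistic regression baseline, assuming X / y as above.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

# Hold out a test set so the network and the baseline are scored the same way
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
print('Logistic regression test accuracy:', baseline.score(X_test, y_test))

Evaluating the Keras model on the same X_test (model.evaluate(X_test, y_test)) makes the comparison like-for-like; the per-epoch figures above are training accuracy and can differ from held-out accuracy.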