To get acquainted with Keras (TensorFlow backend) for further research projects, I tried to implement a NN for a simple classification problem. I just want to separate randomly distributed points in 2D into two categories depending on their coordinates (the colour indicates the category). The relevant code that generates the data is:
import numpy as np

np.random.seed(40)

def createData(N=120, M=75):
    train_x1 = np.random.random(size=N)
    train_x2 = np.random.random(size=N)
    test_x1 = np.random.random(size=M)
    test_x2 = np.random.random(size=M)

    train_x = np.zeros((N, 2))
    train_y = np.zeros((N, 1))
    test_x = np.zeros((M, 2))
    test_y = np.zeros((M, 1))

    for i in range(N):
        train_x[i][0] = train_x1[i]
        train_x[i][1] = train_x2[i]
        if train_x1[i] < 0.5:
            if train_x2[i] < 0.5:
                train_y[i][0] = 1
            else:
                train_y[i][0] = 2
        else:
            if train_x2[i] < 0.5:
                train_y[i][0] = 2
            else:
                train_y[i][0] = 1

    for j in range(M):
        test_x[j][0] = test_x1[j]
        test_x[j][1] = test_x2[j]
        if test_x1[j] < 0.5:
            if test_x2[j] < 0.5:
                test_y[j][0] = 1
            else:
                test_y[j][0] = 2
        else:
            if test_x2[j] < 0.5:
                test_y[j][0] = 2
            else:
                test_y[j][0] = 1

    return train_x, train_y, test_x, test_y
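For reference, the same XOR-style labeling can be written without the explicit loops. This is a hedged sketch, not part of my original code; note that because the random numbers are drawn in a different order, the exact values will differ from the loop version even with the same seed (`create_data_vectorized` is a name I made up for this sketch):

```python
import numpy as np

np.random.seed(40)

def create_data_vectorized(N=120, M=75):
    # Same labeling rule as the loop version: label 1 when both
    # coordinates fall on the same side of 0.5, label 2 otherwise.
    def make_split(n):
        x = np.random.random(size=(n, 2))
        same_side = (x[:, 0] < 0.5) == (x[:, 1] < 0.5)
        y = np.where(same_side, 1.0, 2.0).reshape(-1, 1)
        return x, y

    train_x, train_y = make_split(N)
    test_x, test_y = make_split(M)
    return train_x, train_y, test_x, test_y
```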
The code for my neural network is:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import matplotlib.pyplot as plt
X, Y, x, y = createData()
# Model
model = Sequential()
model.add(Dense(4, input_dim=2, activation='relu'))
model.add(Dense(10, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
# Compile
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
# Fit
model.fit(X, Y, epochs=500, batch_size=25)
# Evaluation
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
# Predictions
predictions = model.predict(x)
rounded = [round(z[0]) for z in predictions]
print(rounded)
My problem is that, with the above seed and configuration, the neural network reaches an accuracy of only 47.5%. It assigns all test coordinates to just one category. However, other configurations (more layers, more/fewer neurons, other activation functions, other loss functions, etc.) produce similar results.
The last lines of the epoch monitor in the shell look like this:
Epoch 490/500
120/120 [==============================] - 0s - loss: -8.3351 - acc: 0.4750
Epoch 491/500
120/120 [==============================] - 0s - loss: -8.1866 - acc: 0.4750
Epoch 492/500
120/120 [==============================] - 0s - loss: -8.3524 - acc: 0.4750
Epoch 493/500
120/120 [==============================] - 0s - loss: -8.2608 - acc: 0.4750
Epoch 494/500
120/120 [==============================] - 0s - loss: -8.3269 - acc: 0.4750
Epoch 495/500
120/120 [==============================] - 0s - loss: -8.2039 - acc: 0.4750
Epoch 496/500
120/120 [==============================] - 0s - loss: -8.1786 - acc: 0.4750
Epoch 497/500
120/120 [==============================] - 0s - loss: -8.2488 - acc: 0.4750
Epoch 498/500
120/120 [==============================] - 0s - loss: -8.3090 - acc: 0.4750
Epoch 499/500
120/120 [==============================] - 0s - loss: -8.3457 - acc: 0.4750
Epoch 500/500
120/120 [==============================] - 0s - loss: -8.1235 - acc: 0.4750
How can I improve the neural network to get rid of this behaviour? Many thanks in advance for any comments and suggestions!
Answer 0 (score: 0):
First, I would check that your input data is created and labelled correctly. Try printing out a few values to make sure they make sense.
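Following that advice and printing the labels points at one likely culprit (my reading, not stated explicitly in the answer): `binary_crossentropy` with a sigmoid output expects targets in {0, 1}, but the generator above emits 1 and 2, which also explains the negative loss values in the training log. A minimal self-contained check, with the labeling rule reproduced inline:

```python
import numpy as np

np.random.seed(40)
x = np.random.random(size=(50, 2))
# Same rule as createData: 1 if both coordinates are on the same
# side of 0.5, otherwise 2.
y = np.where((x[:, 0] < 0.5) == (x[:, 1] < 0.5), 1.0, 2.0).reshape(-1, 1)

print(np.unique(y))   # reveals labels outside the {0, 1} range
y_fixed = y - 1.0     # remap to {0, 1} before training
print(np.unique(y_fixed))
```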
Second, I usually see this problem when the learning rate is too high. You could try removing the hidden layer with 10 neurons in the middle, or lowering the learning rate. I think a network with that many layers is overkill for this problem. Try using this for your compile line:
import keras  # needed for keras.optimizers.Adam below
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
A learning rate of 0.001 is the default for Adam, but I have found that when 0.001 does not converge, 0.0001 often will.
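To show the task itself is learnable once the labels are remapped to {0, 1} and the step size is reasonable, here is a minimal pure-NumPy sketch (not the answerer's code, and plain full-batch gradient descent rather than Adam) of a one-hidden-layer network on the same XOR-style data:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR-style data with labels already remapped to {0, 1}
X = rng.random((200, 2))
y = ((X[:, 0] < 0.5) == (X[:, 1] < 0.5)).astype(float).reshape(-1, 1)

# One hidden tanh layer, sigmoid output
W1 = rng.normal(0.0, 1.0, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, (8, 1)); b2 = np.zeros(1)

lr = 0.5
losses = []
for step in range(3000):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    # binary cross-entropy, now well-defined and non-negative
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    losses.append(loss)
    # Backpropagation for the sigmoid + cross-entropy output
    dz2 = (p - y) / len(X)
    dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(losses[0], losses[-1])  # loss decreases instead of drifting negative
```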