Basic binary classification with Keras isn't working

Asked: 2018-12-18 11:16:33

Tags: python tensorflow machine-learning keras classification

I'm new to ML and want to perform the simplest possible classification with Keras: if y > 0.5 then label = 1 (x is irrelevant), and if y < 0.5 then label = 0 (x is irrelevant).

As I understand it, a single neuron with a sigmoid activation can perform this kind of linear classification.

import tensorflow.keras as keras
import math

import numpy as np
import matplotlib as mpl

train_data = np.empty((0,2),float)
train_labels = np.empty((0,1),float)


train_data = np.append(train_data, [[0, 0]], axis=0)
train_labels = np.append(train_labels, 0)

train_data = np.append(train_data, [[1, 0]], axis=0)
train_labels = np.append(train_labels, 0)

train_data = np.append(train_data, [[0, 1]], axis=0)
train_labels = np.append(train_labels, 1)

train_data = np.append(train_data, [[1, 1]], axis=0)
train_labels = np.append(train_labels, 1)


model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(1, input_dim = 2, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_data, train_labels, epochs=20)

Training output:

Epoch 1/5
4/4 [==============================] - 1s 150ms/step - loss: 0.4885 - acc: 0.7500
Epoch 2/5
4/4 [==============================] - 0s 922us/step - loss: 0.4880 - acc: 0.7500
Epoch 3/5
4/4 [==============================] - 0s 435us/step - loss: 0.4875 - acc: 0.7500
Epoch 4/5
4/4 [==============================] - 0s 396us/step - loss: 0.4869 - acc: 0.7500
Epoch 5/5
4/4 [==============================] - 0s 465us/step - loss: 0.4863 - acc: 0.7500

And the predictions are poor:

predict_data = np.empty((0,2),float)
predict_data = np.append(predict_data, [[0, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)

predict_labels = model.predict(predict_data)
print(predict_labels)

[[0.49750862]
 [0.51616406]
 [0.774486  ]
 [0.774486  ]]

How can I fix this?

After all that, I tried training the model on 2000 points (which, it seems to me, should be more than enough for such a simple problem), but with no success...

train_data = np.empty((0,2),float)
train_labels = np.empty((0,1),float)

for i in range(0, 1000):
  train_data = np.append(train_data, [[i, 0]], axis=0)
  train_labels = np.append(train_labels, 0)
  train_data = np.append(train_data, [[i, 1]], axis=0)
  train_labels = np.append(train_labels, 1)

model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(1, input_dim = 2, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_data, train_labels, epochs=5)

Epoch 1/5
2000/2000 [==============================] - 1s 505us/step - loss: 7.9669 - acc: 0.5005
Epoch 2/5
2000/2000 [==============================] - 0s 44us/step - loss: 7.9598 - acc: 0.5010
Epoch 3/5
2000/2000 [==============================] - 0s 45us/step - loss: 7.9511 - acc: 0.5010
Epoch 4/5
2000/2000 [==============================] - 0s 50us/step - loss: 7.9408 - acc: 0.5010
Epoch 5/5
2000/2000 [==============================] - 0s 53us/step - loss: 7.9279 - acc: 0.5015

<tensorflow.python.keras.callbacks.History at 0x7f4bdbdbda90>

Predictions:

predict_data = np.empty((0,2),float)
predict_data = np.append(predict_data, [[0, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)

predict_labels = model.predict(predict_data)
print(predict_labels)

[[0.6280617 ]
 [0.48020774]
 [0.8395983 ]
 [0.8395983 ]]

0.6280617 for (0, 0) is really bad.

2 Answers:

Answer 0 (score: 1)

Your problem setup is a bit odd, because you only have four data points yet want to learn the model's weights with gradient descent (or Adam). Also, the batchnorm makes no sense here, so I'd recommend removing it.
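A minimal sketch of that change, assuming the asker's original setup with the BatchNormalization layer simply dropped:

# The asker's model with the BatchNormalization layer removed
model = keras.models.Sequential()
model.add(keras.layers.Dense(1, input_dim=2, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_data, train_labels, epochs=20)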

Apart from that, your network predicts numbers between 0 and 1 ("probabilities") rather than class labels. To get the predicted class labels, you can use model.predict_classes(predict_data) instead of model.predict().
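Note that predict_classes was removed in newer Keras versions; a sketch of an equivalent that works either way, thresholding the sigmoid output at 0.5:

# For a single sigmoid output, predict_classes is equivalent to
# thresholding the predicted probabilities at 0.5
predicted_classes = (model.predict(predict_data) > 0.5).astype(int)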

If you're new to ML and want to experiment with toy datasets, you can also have a look at scikit-learn, a library implementing more traditional ML algorithms, whereas Keras is specifically designed for deep learning. Consider, for example, logistic regression, which is the same as a single neuron with sigmoid activation but is implemented with different solvers in sklearn:

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model = model.fit(train_data, train_labels)
model.predict(predict_data)
> array([0., 0., 1., 1.])

The scikit-learn website contains many examples illustrating these different algorithms on toy datasets.

In your second case, you don't allow for any variation in the second feature, which is the only one that matters. If you want to train the model on 1000 data points, you could generate data around the four points in your original dataset, adding some random noise to them:

import keras
import numpy as np
import matplotlib.pyplot as plt

# Generate toy dataset
train_data = np.random.randint(0, 2, size=(1000, 2))
# Add gaussian noise
train_data = train_data + np.random.normal(scale=2e-1, size=train_data.shape)
train_labels = (train_data[:, 1] > 0.5).astype(int)

# Visualize the data, color-coded by their classes
fig, ax = plt.subplots()
ax.scatter(train_data[:, 0], train_data[:, 1], c=train_labels)

[image: scatter plot of the generated points, color-coded by class]

# Train a simple neural net
model = keras.models.Sequential()
model.add(keras.layers.Dense(1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(train_data, train_labels, epochs=20)

You can use the history object to visualize how the loss or accuracy evolved during training:

fig, ax = plt.subplots()
ax.plot(history.history['acc'])

[image: training accuracy over epochs]

Finally, test the model on some test data:

from sklearn.metrics import accuracy_score

# Test on test data
test_data = np.random.randint(0, 2, size=(100, 2))
# Add gaussian noise
test_data = test_data + np.random.normal(scale=2e-1, size=test_data.shape)
test_labels = (test_data[:, 1] > 0.5).astype(int)

accuracy_score(test_labels, model.predict_classes(test_data))

Note, however, that the entire problem can be solved using only the second coordinate. So if you throw away the first one, you're fine:

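A sketch of that variant, assuming the same noisy toy data generated above and keeping only the second column (train_data[:, 1:] preserves the 2-D shape Keras expects):

# Retrain on the second feature alone, which fully determines the label
model = keras.models.Sequential()
model.add(keras.layers.Dense(1, input_shape=(1,), activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(train_data[:, 1:], train_labels, epochs=20)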

This model quickly reaches high accuracy: [image: training accuracy curve]

Answer 1 (score: 0)

Yes. First of all, BatchNorm and Adam don't make much sense in this scenario. And the reason your prediction doesn't work is that your model is too weak to solve the equation. If you try to solve it mathematically, you have:

sigmoid(w1*x1 + w2*x2 + b0) = y

With your training data, you get:

1) sigmoid(b0) = 0 => b0 = -infinite
2) sigmoid(w1+b0) = 0 => w1 = constant
3) sigmoid(w2+b0) = 1 => w2 >> |b0| (already starting to break...)
4) sigmoid(w1+w2+b0) = 1 => same as 3

So I think the optimizer will start oscillating between equations 2 and 3, each pushing its magnitude up against the other, and you'll never be able to use this model for prediction.

And the 75% accuracy you saw makes sense: you have 4 training examples and, as shown above, one prediction is impossible to get right, so you end up with 3/4 accuracy.
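As a rough numeric illustration of the limit behavior described above (the weight values here are hypothetical, chosen only to respect the sign constraints): the sigmoid only approaches the targets 0 and 1 asymptotically, so cross-entropy keeps pushing the weights toward infinity.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights respecting the constraints above:
# b0 very negative, w1 zero, w2 large enough to dominate |b0|
for scale in [1, 5, 10]:
    b0, w1, w2 = -scale, 0.0, 2.0 * scale
    preds = [sigmoid(w1 * x1 + w2 * x2 + b0)
             for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]]
    print(scale, np.round(preds, 4))

# The outputs approach the targets 0, 0, 1, 1 but never reach them,
# matching the "b0 = -infinite" observation above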