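I am new to ML and want to perform the simplest possible classification with Keras: if y > 0.5 then label = 1 (regardless of x), and if y < 0.5 then label = 0 (regardless of x).
As far as I understand, a single neuron with a sigmoid activation can perform this kind of linear classification.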
import tensorflow.keras as keras
import math
import numpy as np
import matplotlib as mpl
train_data = np.empty((0,2),float)
train_labels = np.empty((0,1),float)
train_data = np.append(train_data, [[0, 0]], axis=0)
train_labels = np.append(train_labels, 0)
train_data = np.append(train_data, [[1, 0]], axis=0)
train_labels = np.append(train_labels, 0)
train_data = np.append(train_data, [[0, 1]], axis=0)
train_labels = np.append(train_labels, 1)
train_data = np.append(train_data, [[1, 1]], axis=0)
train_labels = np.append(train_labels, 1)
model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(1, input_dim = 2, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(train_data, train_labels, epochs=20)
Training:
Epoch 1/5
4/4 [==============================] - 1s 150ms/step - loss: 0.4885 - acc: 0.7500
Epoch 2/5
4/4 [==============================] - 0s 922us/step - loss: 0.4880 - acc: 0.7500
Epoch 3/5
4/4 [==============================] - 0s 435us/step - loss: 0.4875 - acc: 0.7500
Epoch 4/5
4/4 [==============================] - 0s 396us/step - loss: 0.4869 - acc: 0.7500
Epoch 5/5
4/4 [==============================] - 0s 465us/step - loss: 0.4863 - acc: 0.7500
And the predictions are not good:
predict_data = np.empty((0,2),float)
predict_data = np.append(predict_data, [[0, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_labels = model.predict(predict_data)
print(predict_labels)
[[0.49750862]
[0.51616406]
[0.774486 ]
[0.774486 ]]
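How can I fix this?
After that, I tried training the model on 2000 points (which seems to me more than enough for such a simple problem), but without success...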
train_data = np.empty((0,2),float)
train_labels = np.empty((0,1),float)
for i in range(0, 1000):
    train_data = np.append(train_data, [[i, 0]], axis=0)
    train_labels = np.append(train_labels, 0)
    train_data = np.append(train_data, [[i, 1]], axis=0)
    train_labels = np.append(train_labels, 1)
model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(1, input_dim = 2, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(train_data, train_labels, epochs=5)
Epoch 1/5
2000/2000 [==============================] - 1s 505us/step - loss: 7.9669 - acc: 0.5005
Epoch 2/5
2000/2000 [==============================] - 0s 44us/step - loss: 7.9598 - acc: 0.5010
Epoch 3/5
2000/2000 [==============================] - 0s 45us/step - loss: 7.9511 - acc: 0.5010
Epoch 4/5
2000/2000 [==============================] - 0s 50us/step - loss: 7.9408 - acc: 0.5010
Epoch 5/5
2000/2000 [==============================] - 0s 53us/step - loss: 7.9279 - acc: 0.5015
<tensorflow.python.keras.callbacks.History at 0x7f4bdbdbda90>
Prediction:
predict_data = np.empty((0,2),float)
predict_data = np.append(predict_data, [[0, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 0]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_data = np.append(predict_data, [[1, 1]], axis=0)
predict_labels = model.predict(predict_data)
print(predict_labels)
[[0.6280617 ]
[0.48020774]
[0.8395983 ]
[0.8395983 ]]
The value 0.6280617 for (0, 0) is very bad.
Answer 0 (score: 1)
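Your problem setup is a bit unusual, since you have only four data points yet want to learn the model weights with gradient descent (or Adam). Also, batch normalization doesn't make sense here, so I would suggest removing it.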
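To make the first point concrete: with four samples and Keras' default batch size of 32, each epoch performs a single weight update, so 20 epochs amount to only 20 small steps. The following is a minimal NumPy sketch of a single sigmoid neuron trained on the four points. It uses plain gradient descent rather than Adam, and the helper name `train_single_neuron`, the learning rates, and the step counts are illustrative assumptions, not values from the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_single_neuron(X, y, lr, steps):
    """Full-batch gradient descent on binary cross-entropy (hypothetical helper)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)        # predicted probabilities
        grad = p - y                  # gradient of the loss w.r.t. the pre-activation
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# The four training points from the question
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([0, 0, 1, 1], dtype=float)

# A few tiny steps: the outputs barely move away from 0.5
w_few, b_few = train_single_neuron(X, y, lr=0.001, steps=20)
probs_few = sigmoid(X @ w_few + b_few)

# Many larger steps: the neuron separates the two classes
w_many, b_many = train_single_neuron(X, y, lr=0.5, steps=5000)
probs_many = sigmoid(X @ w_many + b_many)
labels_many = (probs_many > 0.5).astype(int)

print(probs_few)
print(probs_many)
print(labels_many)
```

Under these assumptions the short run stays close to 0.5 for every point, while the long run assigns the intended labels, which suggests the number of updates matters more here than the model's capacity.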
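Apart from that, your network predicts numbers between 0 and 1 ("probabilities"), not class labels. To get the predicted class labels, you can use model.predict_classes(predict_data) instead of model.predict().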
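Note that `predict_classes` exists only on `Sequential` models in older Keras releases and was removed in TensorFlow 2.6. A minimal sketch of the equivalent step, assuming a single sigmoid output and a 0.5 cutoff, applied to the probabilities printed in the question (the array below stands in for the result of `model.predict(predict_data)`):

```python
import numpy as np

# Probabilities copied from the first prediction output in the question
predict_probs = np.array([[0.49750862],
                          [0.51616406],
                          [0.774486],
                          [0.774486]])

# Threshold at 0.5 and flatten to a 1-D array of class labels
predicted_classes = (predict_probs > 0.5).astype(int).ravel()
print(predicted_classes)
```

This makes the actual failure visible: the second point, (1, 0), ends up just above the threshold and is labelled 1.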
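If you are new to ML and want to experiment with toy datasets, you could also take a look at scikit-learn, a library that implements more traditional ML algorithms, whereas Keras is designed specifically for deep learning. Consider, for example, logistic regression, which is the same as a single neuron with a sigmoid activation but is fitted in sklearn with different solvers:
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model = model.fit(train_data, train_labels)
model.predict(predict_data)
> array([0., 0., 1., 1.])
The scikit-learn website contains many examples that illustrate these different algorithms on toy datasets.
In your second setup, you don't allow any variation in the second feature, which is the only one that matters. If you want to train the model on 1000 data points, you can generate data around the four points of your original dataset and add some random noise to them: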
import keras
import numpy as np
import matplotlib.pyplot as plt
# Generate toy dataset
train_data = np.random.randint(0, 2, size=(1000, 2))
# Add gaussian noise
train_data = train_data + np.random.normal(scale=2e-1, size=train_data.shape)
train_labels = (train_data[:, 1] > 0.5).astype(int)
# Visualize the data, color-coded by their classes
fig, ax = plt.subplots()
ax.scatter(train_data[:, 0], train_data[:, 1], c=train_labels)
# Train a simple neural net
model = keras.models.Sequential()
model.add(keras.layers.Dense(1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train_data, train_labels, epochs=20)
You can use the history object to visualize how the loss or accuracy evolves during training:
fig, ax = plt.subplots()
# The key is 'acc' in older Keras releases and 'accuracy' in newer ones
ax.plot(history.history['acc'])
Finally, test the model on some test data:
from sklearn.metrics import accuracy_score
# Test on test data
test_data = np.random.randint(0, 2, size=(100, 2))
# Add gaussian noise
test_data = test_data + np.random.normal(scale=2e-1, size=test_data.shape)
test_labels = (test_data[:, 1] > 0.5).astype(int)
# The model was trained on both features, so pass both columns
accuracy_score(test_labels, model.predict_classes(test_data))
However, note that the whole problem can be solved using only the second coordinate, so you would be fine if you dropped the first one.
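Dropping the first coordinate, as suggested in this answer, might look like the sketch below. It is an illustration rather than the answer's original code: it swaps in scikit-learn's `LogisticRegression` for the Keras model, uses a seeded NumPy generator so the run is repeatable, and the helper name `make_noisy_points` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

def make_noisy_points(n):
    """Points around (0,0), (1,0), (0,1), (1,1) with gaussian noise (hypothetical helper)."""
    data = rng.integers(0, 2, size=(n, 2)) + rng.normal(scale=2e-1, size=(n, 2))
    labels = (data[:, 1] > 0.5).astype(int)
    return data, labels

train_data, train_labels = make_noisy_points(1000)
test_data, test_labels = make_noisy_points(100)

# Keep only the second coordinate; [:, [1]] preserves the 2-D shape sklearn expects
x2_train = train_data[:, [1]]
x2_test = test_data[:, [1]]

model = LogisticRegression()
model.fit(x2_train, train_labels)

acc = accuracy_score(test_labels, model.predict(x2_test))
print(acc)
```

Since the labels depend only on the second coordinate, the one-feature model loses no information by ignoring the first.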
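Answer 1 (score: 0)
Yes, first of all, BatchNorm and Adam don't make much sense in this setting. The reason your predictions don't work is that your model is too weak to solve the equations. If you try to solve it mathematically, you have:
sigmoid(w1*x1+w2*x2+b0) = y
From the training data you get: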
1) sigmoid(b0) = 0 => b0 = -infinity
2) sigmoid(w1+b0) = 0 => w1 = constant
3) sigmoid(w2+b0) = 1 => w2 >> |b0| (already starting to break...)
4) sigmoid(w1+w2+b0) = 1 => same as 3
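So I think the trainer will start oscillating between 2 and 3, pushing the magnitude of one ever higher relative to the other, and you will never arrive at a model you can use for prediction.
Seeing 75% accuracy makes sense: you have 4 training examples and, as described above, one of the predictions cannot be gotten right, so you end up with 3/4 accuracy.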
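The four conditions above can also be checked numerically. The exact equalities are only reached in the limit of infinite weights, but a 0.5 decision threshold does not require them to hold exactly. A minimal sketch, where the values of `w1`, `w2`, and `b0` are hand-picked assumptions rather than trained weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical finite weights, chosen by hand for illustration
w1, w2, b0 = 0.0, 10.0, -5.0

inputs = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
targets = np.array([0, 0, 1, 1])

# Evaluate sigmoid(w1*x1 + w2*x2 + b0) for the four training points
outputs = sigmoid(inputs @ np.array([w1, w2]) + b0)
labels = (outputs > 0.5).astype(int)

print(outputs)
print(labels)
```

With these values the outputs land within 0.01 of the targets, so whether training actually finds such weights is a separate question from whether they exist.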