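I am trying to take the output of a Keras model and compute the predictions manually using matrix multiplication. I want to do this to help me understand how Keras works. I will use the simple XOR problem. Here is my code: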
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import LambdaCallback
class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []
    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
history = LossHistory()
# the four different states of the XOR gate
training_data = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")
# the four expected results in the same order
target_data = np.array([[0],[1],[1],[0]], "float32")
model = Sequential()
model.add(Dense(4, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
print_weights = LambdaCallback(on_epoch_end=lambda batch, logs: print(model.layers[0].get_weights()))
model.compile(loss='mean_squared_error',
              optimizer='adam',
              metrics=['binary_accuracy'])
history2 = model.fit(training_data, target_data, epochs=50, verbose=2, callbacks=[print_weights, history])
print(model.predict(training_data).round())
W1 = model.get_weights()[0]
X1 = np.matrix([[0,0],[1,1]], "float32")
wx = np.dot(X1,W1)
b = model.get_weights()[1]
wx = np.reshape(wx,(4,2))
b = np.reshape(b, (4,1))
z = wx + b
from numpy import array, exp
a1 = 1 / (1 + exp(-z))
print('g =\n', a1)
W2 = model.get_weights()[2]
b2 = model.get_weights()[3]
W2 = np.reshape(W2,(1,4))
a1 = np.reshape(a1, (4,1))
wa = np.dot(W2,a1)
z2 = wa + b2
a2 = 1 / (1 + exp(-z2))
print('g =\n', a2)
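As I understand it, get_weights()[0] and get_weights()[1] are the weights and biases of the first layer, and get_weights()[2] and get_weights()[3] are the weights and biases of the second layer. I think my problem is figuring out how x1 and x2 relate to the equation z = Wx + b. The weights are taken from the last epoch, and they are usually weights that reach 100% accuracy. Based on a manual calculation of z = Wx + b followed by the sigmoid of z, I expect the y-hat predictions to be [0,1,1,0].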
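Answer 0 (score: 1)
You are very close!
First, 50 epochs on a training set of only 4 samples is not enough to reproduce the correct output (0,1,1,0), so I raised the number of epochs to 1000. Here is the code I used to get the decimal and rounded outputs: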
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# Set seed for reproducibility
np.random.seed(1)
# the four different states of the XOR gate
training_data = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")
# the four expected results in the same order
target_data = np.array([[0],[1],[1],[0]], "float32")
model = Sequential()
model.add(Dense(4, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['binary_accuracy'])
history = model.fit(training_data, target_data, epochs=1000, verbose=1)
# decimal output
print('decimal output:\n'+str(model.predict(training_data)))
# rounded output
print('rounded output:\n'+str(model.predict(training_data).round()))
# outputs:
decimal output:
[[ 0.25588933]
[ 0.82657152]
[ 0.83840138]
[ 0.16465074]]
rounded output:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
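The model gives the correct rounded output, good! The decimal output can be used for comparison with the manual approach.
For the manual approach, X1 is the input to the model: [0,0], [0,1], [1,0] or [1,1]. X2 is the output of the first layer and the input to the last layer. The weights and biases are exactly as you said ("get_weights()[0] and get_weights()[1] are the weights and biases of the first layer, and get_weights()[2] and get_weights()[3] are the weights and biases of the second layer"). But it looks like you forgot the relu activation function of the first layer. Note also that no reshaping is needed: Keras stores W1 with shape (2, 4), so np.dot(X1, W1) + b1 works directly. Let's look at the solution code: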
# Parameters layer 1
W1 = model.get_weights()[0]
b1 = model.get_weights()[1]
# Parameters layer 2
W2 = model.get_weights()[2]
b2 = model.get_weights()[3]
# Input
X1 = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")
# Use the following X1 for single input instead of all at once
#X1 = np.array([[0,0]])
# First layer calculation
L1 = np.dot(X1,W1)+b1
# Relu activation function
X2 = np.maximum(L1,0)
# Second layer calculation
L2 = np.dot(X2,W2)+b2
# Sigmoid
output = 1/(1+np.exp(-L2))
# decimal output
print('decimal output:\n'+str(output))
# rounded output
print('rounded output:\n'+str(output.round()))
# outputs:
decimal output:
[[ 0.25588933]
[ 0.82657152]
[ 0.83840144]
[ 0.16465074]]
rounded output:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
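You can feed all 4 inputs at once as above, or use just a single input as suggested by the commented-out #X1 line. Note that the decimal output of model.predict and the output of the manual approach are practically identical (there is a tiny deviation in the third value, possibly due to floating-point rounding differences between Keras and NumPy).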
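To check the manual forward pass without training anything, here is a minimal NumPy-only sketch of the same calculation (ReLU hidden layer, sigmoid output). The function name manual_forward and the weights below are hypothetical: they are hand-picked for illustration with only 2 hidden units, and are not taken from the trained model above.

```python
import numpy as np

def relu(z):
    # Element-wise max(z, 0)
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def manual_forward(X, W1, b1, W2, b2):
    """Two-layer forward pass in the row-vector convention X @ W + b."""
    hidden = relu(X @ W1 + b1)         # shape (n_samples, n_hidden)
    return sigmoid(hidden @ W2 + b2)   # shape (n_samples, 1)

# Hypothetical hand-picked weights (NOT from a trained Keras model):
# h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), z = 10*h1 - 20*h2 - 5
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])            # shape (2 inputs, 2 hidden units)
b1 = np.array([0.0, -1.0])             # broadcast over the rows of X @ W1
W2 = np.array([[10.0],
               [-20.0]])               # shape (2 hidden units, 1 output)
b2 = np.array([-5.0])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_hat = manual_forward(X, W1, b1, W2, b2)
print('decimal output:\n' + str(y_hat))
print('rounded output:\n' + str(y_hat.round()))
```

With weights from a trained model, the same function can be called as manual_forward(X1, *model.get_weights()), and something like np.allclose(manual_output, model.predict(X1), atol=1e-6) is a more robust comparison than reading the decimals by eye, since small float32 differences are expected. The tolerance value here is an assumption, not a Keras recommendation.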