I am trying to replicate the output obtained with Keras's model.predict() using NumPy. My Keras model's layers are as follows:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
main_input (InputLayer)      (None, 10, 76)            0
_________________________________________________________________
masking (Masking)            (None, 10, 76)            0
_________________________________________________________________
rnn (SimpleRNN)              [(None, 64), (None, 64)]  9024
_________________________________________________________________
dropout_15 (Dropout)         (None, 64)                0
_________________________________________________________________
dense1 (Dense)               (None, 64)                4160
_________________________________________________________________
denseoutput (Dense)          (None, 1)                 65
=================================================================
Total params: 13,249
Trainable params: 13,249
Non-trainable params: 0
The second output of the SimpleRNN layer is the state returned by return_state=True.
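For reference, here is a minimal sketch of how a model matching this summary might be defined (the dropout rate, mask value, and dense activations are assumptions; exposing the RNN state as a second model output is what makes model.predict(X)[1] return it in the code below):

from keras.layers import Input, Masking, SimpleRNN, Dropout, Dense
from keras.models import Model

main_input = Input(shape=(10, 76), name='main_input')
masked = Masking(mask_value=-99, name='masking')(main_input)
rnn_out, rnn_state = SimpleRNN(64, return_state=True, name='rnn')(masked)  # output and final state
x = Dropout(0.2, name='dropout_15')(rnn_out)           # dropout rate is an assumption
x = Dense(64, activation='tanh', name='dense1')(x)     # activation is an assumption
out = Dense(1, activation='tanh', name='denseoutput')(x)
model = Model(main_input, [out, rnn_state])            # state exposed as second output (assumption)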
I tried two different approaches. First, I computed W*Xt + U*s + b, where W is the kernel, Xt is the input, U is the recurrent kernel, s is the state obtained via return_state=True, and b is the bias. This returned output similar to what I get from predict() (function mult_1). After that, I tried a similar approach in function mult_2, but the results were worse than those of mult_1.
import numpy as np
import numpy.ma as ma

def mult_1(X):
    X = ma.masked_values(X, -99)      # mask the padding value (-99)
    s = model.predict(X)[1]           # final RNN state (second model output)
    W = model.get_weights()[0]        # rnn kernel
    U = model.get_weights()[1]        # rnn recurrent kernel
    b = model.get_weights()[2]        # rnn bias
    Wx = np.dot(X[:, -1, :], W)       # last timestep only
    Us = np.dot(s, U)
    output = Wx + Us + b
    return np.tanh(output)
def mult_2(X):
    max_habitantes = X.shape[1]        # number of timesteps (10)
    i = 0
    s_0 = np.ones((X.shape[0], 64))    # initial state
    X = ma.masked_values(X, -99)
    while i < 10:
        Xt = X[:, i, :]
        if i == 0:
            s = s_0
        else:
            s = output
        W = model.get_weights()[0]     # rnn kernel
        U = model.get_weights()[1]     # rnn recurrent kernel
        b = model.get_weights()[2]     # rnn bias
        Wx = np.dot(Xt, W)
        Us = np.dot(s, U)
        output = np.tanh(Wx + Us + b)
        i = i + 1
    return output
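Both functions assume that the first three entries of model.get_weights() are the SimpleRNN kernel, recurrent kernel, and bias. A quick shape check (just a sketch, based on the parameter counts in the summary above) can confirm that ordering:

# Expected, given the summary: (76, 64) rnn kernel, (64, 64) rnn recurrent kernel,
# (64,) rnn bias, (64, 64) dense1 kernel, (64,) dense1 bias,
# (64, 1) denseoutput kernel, (1,) denseoutput bias
for w in model.get_weights():
    print(w.shape)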
Although the predictions are not very different from those of predict(), they are somewhat off. Am I doing the multiplication incorrectly?
Answer 0 (score: 0)
You should use an array of zeros as the initial state of the rnn in mult_2. The following two code snippets will give you the same result:
x = np.random.rand(1,10,76)
Prediction with Keras model.predict():
inputs = Input(shape=(10,76), dtype=np.float32)
_, state = SimpleRNN(units=64, return_state=True)(inputs)
out_drop = Dropout(0.2)(state)
out_d1 = Dense(64, activation='tanh')(out_drop)
out = Dense(1, activation='tanh')(out_d1)
model = Model(inputs, out)
In [1]: model.predict(x)
Out[1]: array([[-0.82426485]])
Prediction with NumPy functions:
def rnn_pred(X):
    """
    Same as your mult_2 func. but with zero init. for rnn initial state
    """
    W = model.get_weights()[0]        # rnn kernel
    U = model.get_weights()[1]        # rnn recurrent kernel
    b = model.get_weights()[2]        # rnn bias
    max_habitantes = X.shape[1]       # number of timesteps (10)
    i = 0
    s_0 = np.zeros((X.shape[0], 64))  # initial state (zeros)
    while i < 10:
        Xt = X[:, i, :]
        if i == 0:
            s = s_0
        else:
            s = output
        Wx = np.dot(Xt, W)
        Us = np.dot(s, U)
        output = np.tanh(Wx + Us + b)
        i = i + 1
    return output
def dense_pred(rnn_out):
    U_d1 = model.get_weights()[3]     # dense64 weights
    b_d1 = model.get_weights()[4]     # dense64 bias
    U_d2 = model.get_weights()[5]     # dense1 weights
    b_d2 = model.get_weights()[6]     # dense1 bias
    out1 = np.dot(rnn_out, U_d1) + b_d1
    out1 = np.tanh(out1)
    out2 = np.dot(out1, U_d2) + b_d2
    out2 = np.tanh(out2)
    return out2
In [2]: dense_pred(rnn_pred(x))
Out[2]: array([[-0.82426485]])
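As a sanity check (a sketch, not part of the original answer), the two paths can be compared on a small random batch; Dropout is inactive at prediction time, so the results should agree up to float32 precision:

x_batch = np.random.rand(5, 10, 76).astype(np.float32)
keras_out = model.predict(x_batch)                    # Keras prediction
numpy_out = dense_pred(rnn_pred(x_batch))             # NumPy reimplementation
print(np.allclose(keras_out, numpy_out, atol=1e-5))   # expected: True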