层:
(None,75)
(75,3)
(3,1)
对于最后一层,输出必须计算为( (H21*w1)*(H22*w2)*(H23*w3))
,其中H21,H22,H23
是隐藏层2的结果,而w1,w2,w3
将是恒定权重,不可训练。那么如何为上述结果编写lambda函数
def product(X):
return X[0]*X[1]
keras_model = Sequential()
keras_model.add(Dense(75,
input_dim=75,activation='tanh',name="layer1" ))
keras_model.add(Dense(3 ,activation='tanh',name="layer2" ))
keras_model.add(Dense(1,name="layer3"))
cross1=keras_model.add(Lambda(lambda x:product,output_shape=(1,1)))([layer2,layer3])
print(cross1)
NameError:未定义名称“ layer2”
答案 0 :(得分:0)
使用功能性API模型
inputs = Input((75,)) #shape (batch, 75)
output1 = Dense(75, activation='tanh',name="layer1" )(inputs) #shape (batch, 75)
output2 = Dense(3 ,activation='tanh',name="layer2" )(output1) #shape (batch, 3)
output3 = Dense(1,name="layer3")(output2) #shape (batch, 1)
cross1 = Lambda(lambda x: x[0] * x[1])([output2, output3]) #shape (batch, 3)
model = Model(inputs, cross1)
请注意,形状与您期望的完全不同。
答案 1 :(得分:0)
我建议您通过自定义层而不是Lambda
层来执行此操作。为什么?量身定制的产品将使您有更多的自由做东西,并且在查看所需的重量方面也更加透明。更准确地说,如果通过Lambda
层进行操作,则恒定权重将不会保存为模型的一部分,但如果使用自定义层,则将被保存。
from keras import backend as K
from keras.layers import *
from keras.models import *
import numpy as np
class MyLayer(Layer) :
# see https://keras.io/layers/writing-your-own-keras-layers/
def __init__(self,
w_vec=None,
allow_training=False,
**kwargs) :
self._w_vec = w_vec
assert allow_training or (w_vec is not None), \
"ERROR: non-trainable w_vec must be initialized"
self.allow_training = allow_training
super().__init__(**kwargs)
return
def build(self, input_shape) :
batch_size, num_feats = input_shape
self.w_vec = self.add_weight(shape=(1, num_feats),
name='w_vec',
initializer='uniform', # <- use your own preferred initializer
trainable=self.allow_training,)
if self._w_vec is not None :
# predefined w_vec
assert self._w_vec.shape[1] == num_feats, \
"ERROR: initial w_vec shape mismatches the input shape"
# set it to the weight
self.set_weights([self._w_vec]) # <- set weights to the supplied one
super().build(input_shape)
return
def call(self, x) :
# Given:
# x = [H21, H22, H23]
# w_vec = [w1, w2, w3]
# Step 1: output elem_prod
# elem_prod = [H21*w1, H22*w2, H23*w3]
elem_prod = x * self.w_vec
# Step 2: output ret
# ret = (H21*w1) * (H22*w2) * (H23*w3)
ret = K.prod(elem_prod, axis=-1, keepdims=True)
return ret
def compute_output_shape(self, input_shape) :
return (input_shape[0], 1)
def make_test_cases(w_vec=None, allow_training=False):
x = Input(shape=(75,))
y = Dense(75, activation='tanh', name='fc1')(x)
y = Dense(3, activation='tanh', name='fc2')(y)
y = MyLayer(w_vec, allow_training, name='core')(y)
y = Dense(1, name='fc3')(y)
net = Model(inputs=x, outputs=y, name='{}-{}'.format( 'randomInit' if w_vec is None else 'assignInit',
'trainable' if allow_training else 'nontrainable'))
print(net.name)
print(net.layers[-2].get_weights()[0])
print(net.summary())
return net
然后您可以运行以下测试用例来查看差异(注意打印输出中的第一行和最后一行,分别为您提供初始值和常数参数的数量)
m1 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=False)
会给你
assignInit-nontrainable [[0. 1. 2.]]
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) (None, 75) 0
_________________________________________________________________
fc1 (Dense) (None, 75) 5700
_________________________________________________________________
fc2 (Dense) (None, 3) 228
_________________________________________________________________
core (MyLayer) (None, 1) 3
_________________________________________________________________
fc3 (Dense) (None, 1) 2
=================================================================
Total params: 5,933
Trainable params: 5,930
Non-trainable params: 3
_________________________________________________________________
m2 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=True)
会给你
assignInit-trainable [[0. 1. 2.]]
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) (None, 75) 0
_________________________________________________________________
fc1 (Dense) (None, 75) 5700
_________________________________________________________________
fc2 (Dense) (None, 3) 228
_________________________________________________________________
core (MyLayer) (None, 1) 3
_________________________________________________________________
fc3 (Dense) (None, 1) 2
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________
m3 = make_test_cases(w_vec=None, allow_training=True)
会给你
randomInit-trainable [[ 0.02650297 -0.02010062 -0.03771694]]
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) (None, 75) 0
_________________________________________________________________
fc1 (Dense) (None, 75) 5700
_________________________________________________________________
fc2 (Dense) (None, 3) 228
_________________________________________________________________
core (MyLayer) (None, 1) 3
_________________________________________________________________
fc3 (Dense) (None, 1) 2
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________
我会说,目前尚不清楚哪种情况可能会更好地解决您的问题,但是尝试所有这三种听起来都不错。