How to do matrix multiplication in a custom Keras layer

Date: 2019-12-27 17:58:49

Tags: machine-learning keras keras-layer keras-2

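Layers:

  • Input, shape (None, 75)
  • Hidden layer 1, shape (75, 3)
  • Hidden layer 2, shape (3, 1)

For the last layer, the output must be computed as ((H21*w1)*(H22*w2)*(H23*w3)), where H21, H22, H23 are the outputs of hidden layer 2 and w1, w2, w3 are constant, non-trainable weights. How do I write a Lambda function that produces this result?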

def product(X):
    return X[0]*X[1]

keras_model = Sequential()
keras_model.add(Dense(75, 
input_dim=75,activation='tanh',name="layer1" ))
keras_model.add(Dense(3 ,activation='tanh',name="layer2" ))
keras_model.add(Dense(1,name="layer3"))
cross1=keras_model.add(Lambda(lambda x:product,output_shape=(1,1)))([layer2,layer3])
print(cross1)        
  

NameError: name 'layer2' is not defined

2 answers:

Answer 0 (score: 0)

Use a functional API model:

from keras.layers import Input, Dense, Lambda
from keras.models import Model

inputs = Input((75,))                                         #shape (batch, 75)
output1 = Dense(75, activation='tanh',name="layer1" )(inputs) #shape (batch, 75)
output2 = Dense(3 ,activation='tanh',name="layer2" )(output1) #shape (batch, 3)
output3 = Dense(1,name="layer3")(output2)                     #shape (batch, 1)

cross1 = Lambda(lambda x: x[0] * x[1])([output2, output3])    #shape (batch, 3)

model = Model(inputs, cross1)

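Note that the resulting shape is quite different from what you expect: multiplying a (batch, 3) tensor by a (batch, 1) tensor broadcasts to (batch, 3), not (batch, 1).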
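The shape behaviour can be checked outside Keras. The following is a minimal NumPy sketch, not part of the original answer; the batch size and the fill values are made up for illustration, and the arrays only stand in for the layer outputs.

```python
import numpy as np

batch = 4  # hypothetical batch size, for illustration only
output2 = np.ones((batch, 3))       # stands in for layer2's output
output3 = np.full((batch, 1), 2.0)  # stands in for layer3's output

# (batch, 3) * (batch, 1): the size-1 axis is broadcast across the 3 columns
cross1 = output2 * output3
print(cross1.shape)

# Reducing over the last axis brings the result back to one value per sample
reduced = cross1.prod(axis=-1, keepdims=True)
print(reduced.shape)
```

If one value per sample is what you need, a reduction over the last axis (as in the final step) is one way to get there.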

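Answer 1 (score: 0)

I suggest doing this with a custom layer rather than a Lambda layer. Why? A custom layer gives you more freedom in what you can do, and it is also more transparent about the weights involved. More precisely, if you do this through a Lambda layer, the constant weights will not be saved as part of the model, but with a custom layer they will be.

Here is an example: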

from keras import backend as K
from keras.layers import *
from keras.models import *
import numpy as np 


class MyLayer(Layer) :
    # see https://keras.io/layers/writing-your-own-keras-layers/
    def __init__(self, 
                 w_vec=None, 
                 allow_training=False,
                 **kwargs) :
        self._w_vec = w_vec
        assert allow_training or (w_vec is not None), \
            "ERROR: non-trainable w_vec must be initialized"
        self.allow_training = allow_training
        super().__init__(**kwargs)
        return
    def build(self, input_shape) :
        batch_size, num_feats = input_shape
        self.w_vec = self.add_weight(shape=(1, num_feats),
                                     name='w_vec',
                                     initializer='uniform', # <- use your own preferred initializer
                                     trainable=self.allow_training,)
        if self._w_vec is not None :
            # predefined w_vec
            assert self._w_vec.shape[1] == num_feats, \
                "ERROR: initial w_vec shape mismatches the input shape"
            # set it to the weight
            self.set_weights([self._w_vec])  # <- set weights to the supplied one
        super().build(input_shape)
        return
    def call(self, x) :
        # Given:
        #   x = [H21, H22, H23]
        #   w_vec = [w1, w2, w3]
        # Step 1: output elem_prod
        #   elem_prod = [H21*w1, H22*w2, H23*w3]
        elem_prod = x * self.w_vec
        # Step 2: output ret
        #   ret = (H21*w1) * (H22*w2) * (H23*w3)
        ret = K.prod(elem_prod, axis=-1, keepdims=True)
        return ret
    def compute_output_shape(self, input_shape) :
        return (input_shape[0], 1)

def make_test_cases(w_vec=None, allow_training=False):
    x = Input(shape=(75,))
    y = Dense(75, activation='tanh', name='fc1')(x)
    y = Dense(3, activation='tanh', name='fc2')(y)
    y = MyLayer(w_vec, allow_training, name='core')(y)
    y = Dense(1, name='fc3')(y)
    net = Model(inputs=x, outputs=y, name='{}-{}'.format( 'randomInit' if w_vec is None else 'assignInit',
                                                          'trainable' if allow_training else 'nontrainable'))
    print(net.name)
    print(net.layers[-2].get_weights()[0])
    print(net.summary())
    return net
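As a sanity check on what `call` computes, the same two steps can be reproduced in plain NumPy. This is only a sketch with hypothetical input values and weights; it mirrors the element-wise multiply followed by a product over the last axis, and does not involve Keras.

```python
import numpy as np

# Hypothetical values, for illustration only
x = np.array([[0.5, -1.0, 2.0],
              [1.0,  1.0, 1.0]])     # two samples of [H21, H22, H23]
w_vec = np.array([[2.0, 3.0, 4.0]])  # [w1, w2, w3], shape (1, 3)

# Step 1: element-wise products [H21*w1, H22*w2, H23*w3] per sample
elem_prod = x * w_vec

# Step 2: multiply the three terms together, keeping a trailing axis of size 1
ret = np.prod(elem_prod, axis=-1, keepdims=True)
print(ret)  # -24 for the first sample, 24 for the second
```

One consequence of the product form: if any entry of `w_vec` is zero, as with the first entry of `np.arange(3)` used in the test cases, the layer outputs zero for every input.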

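You can then run the following test cases to see the differences (note the first and last lines of each printout, which give you the initial weight values and the number of non-trainable parameters, respectively).

a. Constant weights, non-trainable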

m1 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=False)

gives you

assignInit-nontrainable [[0. 1. 2.]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_4 (InputLayer)         (None, 75)                0         
_________________________________________________________________  
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________  
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________  
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________  
fc3 (Dense)                  (None, 1)                 2         
=================================================================  
Total params: 5,933  
Trainable params: 5,930  
Non-trainable params: 3
_________________________________________________________________ 

b. Preset initial weights, trainable

m2 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=True)

gives you

assignInit-trainable    [[0. 1. 2.]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_5 (InputLayer)         (None, 75)                0         
_________________________________________________________________
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________
fc3 (Dense)                  (None, 1)                 2         
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________

c. Random weights, trainable

m3 = make_test_cases(w_vec=None, allow_training=True)

gives you

randomInit-trainable [[ 0.02650297 -0.02010062 -0.03771694]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_6 (InputLayer)         (None, 75)                0         
_________________________________________________________________
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________
fc3 (Dense)                  (None, 1)                 2         
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________
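The parameter counts in the summaries can be reproduced by hand. The sketch below assumes the usual Dense layer count of inputs times outputs plus one bias per output; `dense_params` is a hypothetical helper written for this illustration, not a Keras function.

```python
def dense_params(n_in, n_out):
    """Parameters of a Dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out

fc1 = dense_params(75, 75)  # 5700
fc2 = dense_params(75, 3)   # 228
core = 3                    # MyLayer holds one weight per input feature
fc3 = dense_params(1, 1)    # 2

total = fc1 + fc2 + core + fc3
print(total)         # 5933
print(total - core)  # 5930, the trainable count when w_vec is frozen (case a)
```

In cases b and c the three `core` weights are trainable, so all 5,933 parameters count as trainable.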

Final words

It is not clear to me which of these cases would work best for your problem, but trying all three sounds like a good idea.