Failed to integrate sampled softmax into Keras

Asked: 2019-02-18 23:01:23

Tags: tensorflow keras

Based on How can I use TensorFlow's sampled softmax loss function in a Keras model?, I created the following code:

import tensorflow
import tensorflow as tf
from tensorflow.keras import backend as K

class SampledSoftmax(tensorflow.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(SampledSoftmax, self).__init__(**kwargs)

    def call(self, inputs):
        # inputs[0]: output of the final Dense projection, inputs[1]: one-hot targets
        def f1():
            # training branch: sampled softmax over the output layer's weights/bias
            return tf.nn.sampled_softmax_loss(
                inputs[0]._keras_history[0].weights[0],
                inputs[0]._keras_history[0].bias,
                tf.reshape(tf.argmax(inputs[1], 1), [-1, 1]),
                inputs[0],
                8192,      # num_sampled
                817496)    # num_classes (vocabulary size)

        def f2():
            # evaluation branch: full softmax cross-entropy
            logits = tf.matmul(inputs[0], tf.transpose(inputs[0]._keras_history[0].weights[0]))
            logits = tf.nn.bias_add(logits, inputs[0]._keras_history[0].bias)
            return tf.nn.softmax_cross_entropy_with_logits_v2(
                labels=inputs[1],
                logits=logits)

        # tf.cond expects callables, not already-evaluated tensors
        return tf.cond(K.learning_phase(), true_fn=f1, false_fn=f2)

and used it together with the following model:

#model
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

input_layer = Input(shape=(None,), dtype='int32')
target_input = Input(shape=(None, vocab_size), dtype='int8')

embedding_layer = Embedding(vocab_size,
                            EMBEDDING_DIM,
                            trainable=True,
                            mask_zero=True)(input_layer)
common = LSTM(LSTM_UNITS, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)(embedding_layer)
common = Dense(PROJ_UNITS, activation='linear')(common)
out = Dense(vocab_size, name='output_layer')(common)
out = SampledSoftmax()([out, target_input])

model = Model(inputs=[input_layer, target_input], outputs=out)

It fails with the following error:

ValueError: Shape must be rank 2 but is rank 3 for 'sampled_softmax/sampled_softmax_loss/MatMul' (op: 'MatMul') with input shapes: [?,?,817496], [?,817496].
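As far as I understand, tf.nn.sampled_softmax_loss expects a rank-2 inputs tensor of shape (batch, dim) and rank-2 labels of shape (batch, num_true), while my output layer produces a rank-3 (batch, time, vocab) tensor. Here is a minimal standalone sketch of that shape contract (the placeholder/variable names are purely illustrative; the sizes just mirror the ones above):

# Standalone sketch of the shapes tf.nn.sampled_softmax_loss accepts (TF 1.x graph mode).
# Names are illustrative; sizes mirror the model above.
import tensorflow as tf

batch, dim, num_classes, num_sampled = 32, 512, 817496, 8192

inputs = tf.placeholder(tf.float32, [batch, dim])    # rank 2: (batch, dim)
labels = tf.placeholder(tf.int64, [batch, 1])        # rank 2: (batch, num_true)
weights = tf.get_variable('proj_w', [num_classes, dim])
biases = tf.get_variable('proj_b', [num_classes])

loss = tf.nn.sampled_softmax_loss(
    weights=weights,
    biases=biases,
    labels=labels,
    inputs=inputs,
    num_sampled=num_sampled,
    num_classes=num_classes)    # result shape: (batch,)

# A rank-3 (batch, time, features) activation would have to be flattened first,
# e.g. with tf.reshape(x, [-1, dim]), to satisfy this rank-2 requirement.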

Based on some Googling, I made a bit of progress:

import tensorflow
import tensorflow as tf
from tensorflow.keras.layers import InputSpec

class MyLayer(tensorflow.keras.layers.Dense):
    def __init__(self, num_sampled, num_classes, mode, **kwargs):
        self.num_sampled = num_sampled
        self.num_classes = num_classes
        self.mode = mode
        super(MyLayer, self).__init__(num_classes, **kwargs)
        self.input_spec = [InputSpec(ndim=2)]

    def build(self, input_shape):
        #self.input_spec = [InputSpec(shape=input_shape)]
        super(MyLayer, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, inputs_and_labels):
        inputs, labels = inputs_and_labels
        if self.mode == "train":
            # training: sampled softmax over this layer's kernel/bias
            loss = tf.nn.sampled_softmax_loss(
                weights=self.kernel,
                biases=self.bias,
                labels=tf.reshape(tf.argmax(labels, 1), [-1, 1]),
                inputs=inputs,
                num_sampled=self.num_sampled,
                num_classes=self.num_classes,
                num_true=1)

        elif self.mode == "eval":
            # evaluation: full softmax cross-entropy
            logits = tf.matmul(inputs, tf.transpose(self.kernel))
            logits = tf.nn.bias_add(logits, self.bias)
            loss = tf.nn.softmax_cross_entropy_with_logits(
                labels=labels,
                logits=logits)

        return loss

    def compute_output_shape(self, input_shape):
        dense_shape, classes_shape = input_shape
        return (dense_shape[0],)

And the error now is:

ValueError: Layer my_layer expects 1 inputs, but it received 2 input tensors. Inputs received: [<tf.Tensor 'dense/BiasAdd:0' shape=(?, ?, 512) dtype=float32>, <tf.Tensor 'input_2:0' shape=(?, ?, 817496) dtype=int8>]

I tried playing with self.input_spec, but so far I haven't been able to make it work.
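For what it's worth, that check seems to compare the number of incoming tensors against the length of the layer's input_spec, so a spec describing a single rank-2 tensor rejects the two-element input list. A plain Layer subclass that owns its weights and leaves input_spec unset does accept a list in call(); below is a rough sketch of that pattern, not a working fix (the class name TwoInputLoss and all shapes are illustrative):

# Rough sketch only: subclassing Layer instead of Dense avoids the single-input
# input_spec that Dense sets up, so call() can receive a list of tensors.
# The class name and shapes here are illustrative, not taken from my model.
import tensorflow as tf
from tensorflow.keras.layers import Layer

class TwoInputLoss(Layer):
    def __init__(self, num_classes, num_sampled, **kwargs):
        self.num_classes = num_classes
        self.num_sampled = num_sampled
        super(TwoInputLoss, self).__init__(**kwargs)

    def build(self, input_shape):
        proj_shape, _ = input_shape                  # shapes of [projection, one-hot labels]
        self.dim = int(proj_shape[-1])
        self.kernel = self.add_weight('kernel',
                                      shape=(self.num_classes, self.dim),
                                      initializer='glorot_uniform')
        self.bias = self.add_weight('bias',
                                    shape=(self.num_classes,),
                                    initializer='zeros')
        super(TwoInputLoss, self).build(input_shape)

    def call(self, inputs):
        projection, one_hot_labels = inputs          # a list of tensors is accepted here
        labels = tf.reshape(tf.argmax(one_hot_labels, -1), [-1, 1])
        flat = tf.reshape(projection, [-1, self.dim])   # flatten (batch, time, dim) to rank 2
        return tf.nn.sampled_softmax_loss(self.kernel, self.bias, labels, flat,
                                          self.num_sampled, self.num_classes)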

0 Answers:

No answers yet.