Question

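I built a custom ELMo + BiLSTM + CRF sequence classifier for natural language using this repository and keras_contrib.crf.

It works great but suffers from a negative loss, which is theoretically wrong. The issue has been discussed here and here, and the solution seems to be to use masking. However, I had to comment out the compute_mask function in my custom ELMo embedding layer, because with it enabled training starts and then throws: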

InvalidArgumentError: Incompatible shapes: [32,47] vs. [32,0] [[{{node loss/crf_1_loss/mul_6}}]]

where 32 is the batch size and 47 is one less than my specified max_length (presumably meaning it recalculates max_len once the pad token is masked).
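One possible reading of the two shapes, as a sketch rather than a confirmed diagnosis: assume the CRF loss multiplies per-transition scores (one per adjacent pair of timesteps) by a mask that is sliced the same way along the time axis. The helper `transition_count` is hypothetical and only does the arithmetic; it does not call keras_contrib.

```python
def transition_count(timesteps):
    """Number of adjacent (t, t+1) pairs in a sequence of `timesteps` steps."""
    return max(timesteps - 1, 0)


batch_size = 32
max_length = 48      # sequence length the model was built with
mask_timesteps = 1   # time dimension of a mask with shape (?, 1)

# Assumption: both the transition scores and the mask lose one step
# along the time axis before they are multiplied together.
energy_shape = [batch_size, transition_count(max_length)]
mask_shape = [batch_size, transition_count(mask_timesteps)]

print(energy_shape, "vs.", mask_shape)
```

Under that assumption, the 47 would come from the 48 timesteps of the labels and the 0 from a mask with a single timestep, which would point at the (?, 1) mask rather than at max_len being recomputed.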

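The output of the compute_mask function has dim (?, 1). That seems wrong, and I figured I needed to reshape out_mask to 3D to match the output shape of the embeddings (with the dict lookup set to "elmo", the output shape is (batch_size, max_length, 1024), which should be correct because the BiLSTM needs 3D input).

So I tried another compute_mask function (commented out below) that produces a mask of dim (?, 1, 1). This also seems wrong, and sure enough, before the model can even begin training I get: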

AssertionError: Input mask to CRF must have dim 2 if not None
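For comparison, a mask with one boolean per token has shape (batch_size, max_length), i.e. 2 dimensions, which is what the assertion asks for. Below is a minimal pure-Python sketch, assuming the number of real tokens in each sentence is known. `sequence_mask` is a hypothetical helper written for illustration (similar in spirit to `tf.sequence_mask`), and the lengths are made-up values, not data from my model.

```python
def sequence_mask(lengths, max_length):
    """Build one row per sentence: True at real-token positions, False at padding."""
    return [[position < length for position in range(max_length)]
            for length in lengths]


lengths = [3, 5, 1]   # hypothetical token counts for a batch of 3 sentences
max_length = 5        # hypothetical padded length

mask = sequence_mask(lengths, max_length)
shape = (len(mask), len(mask[0]))   # (batch_size, max_length): 2 dimensions

print(shape)
for row in mask:
    print(row)
```

A mask like this has neither the (?, 1) shape of my original compute_mask nor the (?, 1, 1) shape of the alternative.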

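So I'm not sure which of the two errors to focus on, or how to fix them. I've included the most important code below. Happy to share a git repo with the whole thing and/or the full stack trace if needed.

Custom ELMo layer: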

import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
from keras.layers import Layer


class ElmoEmbeddingLayer(Layer):
    def __init__(self, **kwargs):
        self.dimensions = 1024
        self.trainable = True
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=self.trainable,
                               name="{}_module".format(self.name))
        self.trainable_weights += K.tf.trainable_variables(scope="^{}_module/.*".format(self.name))
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        # Input has shape (batch_size, 1): one untokenized string per sample.
        result = self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                           as_dict=True, signature='default')['elmo']
        return result

    # Original compute_mask function. Raises:
    # InvalidArgumentError: Incompatible shapes: [32,47] vs. [32,0] [[{{node loss/crf_1_loss/mul_6}}]]
    def compute_mask(self, inputs, mask=None):
        return K.not_equal(inputs, '__PAD__')

    # Alternative compute_mask function. Raises:
    # AssertionError: Input mask to CRF must have dim 2 if not None
    # def compute_mask(self, inputs, mask=None):
    #     out_mask = K.not_equal(inputs, '__PAD__')
    #     out_mask = K.expand_dims(out_mask)
    #     return out_mask

    def compute_output_shape(self, input_shape):
        return input_shape[0], 48, self.dimensions

The model is built as follows:

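from keras import layers
from keras.layers import Bidirectional, LSTM
from keras.metrics import categorical_accuracy, mean_squared_error
from keras.models import Model
from keras_contrib.layers import CRF
from keras_contrib.losses import crf_loss
from keras_contrib.metrics import crf_accuracy


def build_model():  # uses CRF from keras_contrib; num_tags is defined elsewhere
    input_text = layers.Input(shape=(1,), dtype=tf.string)
    model = ElmoEmbeddingLayer(name='ElmoEmbeddingLayer')(input_text)
    model = Bidirectional(LSTM(units=512, return_sequences=True))(model)
    crf = CRF(num_tags)
    out = crf(model)
    model = Model(input_text, out)
    model.compile(optimizer="rmsprop", loss=crf_loss,
                  metrics=[crf_accuracy, categorical_accuracy, mean_squared_error])
    model.summary()
    return model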

How to fix InvalidArgumentError: Incompatible shapes when using a mask with a custom ELMo layer

0 answers: