使用 Huggingface 包中的 XLNet 转换器训练模型

时间:2020-12-21 21:20:38

标签: python tensorflow keras huggingface-transformers

我想在模型中包含一个预训练的 XLNet(或可能是另一个最先进的转换器)以对其进行微调。

但是,当我将它包含在 keras 层中时它不起作用。

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

inputs = tf.keras.Input(shape=2000, dtype='int32')
x = inputs
xlnetPretrainedModel = TFAutoModel.from_pretrained("xlnet-base-cased")
x = xlnetPretrainedModel(x)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
x = tf.keras.layers.Dense(32, activation='relu')(x)
x = tf.keras.layers.Dense(32, activation=None)(x)
model = tf.keras.Model(inputs=inputs, outputs=x)
model.compile(optimizer='adam',
                      loss='mean_squared_error')
model.summary()

错误是

AttributeError: 'NoneType' object has no attribute 'shape'

在线

x = xlnetPretrainedModel(x)

所以当模型用于输入层时。

如果在 numpy 数组上使用 XLNet 模型可以工作,但我将无法训练它。

完整的错误信息是:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-23-d543506f9697> in <module>
      5 x = inputs
      6 xlnetPretrainedModel = TFAutoModel.from_pretrained("xlnet-base-cased")
----> 7 x = xlnetPretrainedModel(x)
      8 x = tf.keras.layers.GlobalAveragePooling1D()(x)
      9 x = tf.keras.layers.Dense(32, activation='relu')(x)

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    771                     not base_layer_utils.is_in_eager_or_tf_function()):
    772                   with auto_control_deps.AutomaticControlDependencies() as acd:
--> 773                     outputs = call_fn(cast_inputs, *args, **kwargs)
    774                     # Wrap Tensors in `outputs` in `tf.identity` to avoid
    775                     # circular dependencies.

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    235       except Exception as e:  # pylint:disable=broad-except
    236         if hasattr(e, 'ag_error_metadata'):
--> 237           raise e.ag_error_metadata.to_exception(e)
    238         else:
    239           raise

AttributeError: in converted code:

    /opt/conda/lib/python3.7/site-packages/transformers/modeling_tf_xlnet.py:810 call  *
        outputs = self.transformer(inputs, **kwargs)
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py:805 __call__
        inputs, outputs, args, kwargs)
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py:2014 _set_connectivity_metadata_
        input_tensors=inputs, output_tensors=outputs, arguments=arguments)
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py:2044 _add_inbound_node
        arguments=arguments)
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/node.py:110 __init__
        self.output_shapes = nest.map_structure(backend.int_shape, output_tensors)
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py:568 map_structure
        structure[0], [func(*x) for x in entries],
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py:568 <listcomp>
        structure[0], [func(*x) for x in entries],
    /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py:1172 int_shape
        shape = x.shape

    AttributeError: 'NoneType' object has no attribute 'shape'

或在尝试通过 tf.function 修饰调用https://github.com/huggingface/transformers/issues/1350 此处提供的解决方案后

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-c852fba5aa15> in <module>
      8 xlnetPretrainedModel = TFAutoModel.from_pretrained("xlnet-base-cased")
      9 xlnetPretrainedModel.call = tf.function(xlnetPretrainedModel.transformer.call)
---> 10 x = xlnetPretrainedModel(x)
     11 
     12 x = tf.keras.layers.GlobalAveragePooling1D()(x)

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    803               kwargs.pop('mask')
    804             inputs, outputs = self._set_connectivity_metadata_(
--> 805                 inputs, outputs, args, kwargs)
    806           self._handle_activity_regularization(inputs, outputs)
    807           self._set_mask_metadata(inputs, outputs, input_masks)

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py in _set_connectivity_metadata_(self, inputs, outputs, args, kwargs)
   2012     # This updates the layer history of the output tensor(s).
   2013     self._add_inbound_node(
-> 2014         input_tensors=inputs, output_tensors=outputs, arguments=arguments)
   2015     return inputs, outputs
   2016 

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py in _add_inbound_node(self, input_tensors, output_tensors, arguments)
   2042         input_tensors=input_tensors,
   2043         output_tensors=output_tensors,
-> 2044         arguments=arguments)
   2045 
   2046     # Update tensor history metadata.

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/node.py in __init__(self, outbound_layer, inbound_layers, node_indices, tensor_indices, input_tensors, output_tensors, arguments)
    108     self.input_shapes = nest.map_structure(backend.int_shape, input_tensors)
    109     # Nested structure of shape tuples, shapes of output_tensors.
--> 110     self.output_shapes = nest.map_structure(backend.int_shape, output_tensors)
    111 
    112     # Optional keyword arguments to layer's `call`.

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py in map_structure(func, *structure, **kwargs)
    566 
    567   return pack_sequence_as(
--> 568       structure[0], [func(*x) for x in entries],
    569       expand_composites=expand_composites)
    570 

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py in <listcomp>(.0)
    566 
    567   return pack_sequence_as(
--> 568       structure[0], [func(*x) for x in entries],
    569       expand_composites=expand_composites)
    570 

/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py in int_shape(x)
   1170   """
   1171   try:
-> 1172     shape = x.shape
   1173     if not isinstance(shape, tuple):
   1174       shape = tuple(shape.as_list())

AttributeError: 'NoneType' object has no attribute 'shape'

请问,谁能帮我解决这个错误?

1 个答案:

答案 0 :(得分:0)

通常,Huggingface Transformer 库中的模型输出因模型而异。检查文档以了解从 XLNet 的“call()”函数返回哪些值。​​

最后一个隐藏状态一般可以通过以下方式访问:

model= TFAutoModel.from_pretrained("xlnet-base-cased")
# 'last_hidden_state' seems to be common to most of the transformer models
# refer: https://huggingface.co/transformers/model_doc/bert.html#tfbertmodel
output = model(tokenizer_ouput).last_hidden_state
x = tf.keras.layers.Dense(32, activation='relu')(x)
# The rest of you model

进行此更改应该可以解决您的问题。