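I implemented a bidirectional long short-term memory network with a conditional random field layer (BiLSTM-CRF) using keras and keras_contrib (the latter for the CRF, which is not part of native keras functionality). The task is named entity recognition, where each token is classified into one of 6 classes. The input to the network is a sequence of 300-dimensional pretrained GloVe word embeddings. This is my model summary: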
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 648)               0
_________________________________________________________________
embedding_1 (Embedding)      (None, 648, 300)          1500000
_________________________________________________________________
bidirectional_1 (Bidirection (None, 648, 1000)         3204000
_________________________________________________________________
crf_1 (CRF)                  (None, 648, 6)            6054
=================================================================
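As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes standard LSTM gate arithmetic and a CRF made up of an input kernel, a transition matrix, a bias, and two boundary vectors. That CRF breakdown is an assumption chosen because it matches the reported 6054, and the vocabulary size of 5000 and hidden size of 500 per direction are inferred from the counts rather than stated in the post.

```python
def lstm_params(input_dim, hidden):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (hidden * (input_dim + hidden) + hidden)


def bilstm_params(input_dim, hidden):
    # A bidirectional wrapper holds one forward and one backward LSTM.
    return 2 * lstm_params(input_dim, hidden)


def crf_params(input_dim, num_tags):
    # Assumed layout: kernel + transition matrix + bias + left/right boundaries.
    kernel = input_dim * num_tags
    transitions = num_tags * num_tags
    bias = num_tags
    boundaries = 2 * num_tags
    return kernel + transitions + bias + boundaries


EMBEDDING_DIM = 300
VOCAB_SIZE = 5000      # inferred: 1500000 / 300
HIDDEN_PER_DIR = 500   # inferred: gives a 1000-wide bidirectional output
NUM_TAGS = 6

print(VOCAB_SIZE * EMBEDDING_DIM)                   # 1500000
print(bilstm_params(EMBEDDING_DIM, HIDDEN_PER_DIR)) # 3204000
print(crf_params(2 * HIDDEN_PER_DIR, NUM_TAGS))     # 6054
```

Under these assumptions, the summary corresponds to a 5000-word vocabulary and 500 hidden units per direction, which is worth comparing against the constants used when building the model.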
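Now I want to implement the same model in TensorFlow 1.15. Since the keras_contrib CRF module only works with keras and not with TensorFlow, I used the CRF implementation built for TensorFlow 1.X from this repo. The repo contains two nice example implementations of the CRF here, but each one produces a different error when I train on my data.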
Implementation 1
from tensorflow.keras.layers import Bidirectional, Embedding, LSTM, TimeDistributed
from tensorflow.keras.models import Sequential
from tf_crf_layer.layer import CRF
from tf_crf_layer.loss import crf_loss
from tf_crf_layer.metrics import crf_accuracy
MAX_WORDS = 50000
EMBEDDING_LENGTH = 300
MAX_SEQUENCE_LENGTH = 648
HIDDEN_SIZE = 512
model = Sequential()
model.add(Embedding(MAX_WORDS, EMBEDDING_LENGTH, input_length=MAX_SEQUENCE_LENGTH, mask_zero=True, weights=[embedding_matrix], trainable=False))
model.add(Bidirectional(LSTM(HIDDEN_SIZE, return_sequences=True)))
model.add(CRF(len(labels)))
model.compile('adam', loss=crf_loss, metrics=[crf_accuracy])
This is the error I get when I try to compile the model:
  File "/.../tf_crf_layer/metrics/crf_accuracy.py", line 48, in crf_accuracy
    crf, idx = y_pred._keras_history[:2]
AttributeError: 'Tensor' object has no attribute '_keras_history'
The error occurs when crf_accuracy from the repository mentioned above is computed:
def crf_accuracy(y_true, y_pred):
    """
    Get default accuracy based on CRF `test_mode`.
    """
    import pdb; pdb.set_trace()
    crf, idx = y_pred._keras_history[:2]
    if crf.test_mode == 'viterbi':
        return crf_viterbi_accuracy(y_true, y_pred)
    else:
        return crf_marginal_accuracy(y_true, y_pred)
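According to this thread, this kind of error apparently occurs when the tensor object is not the output of a keras layer. Why does the error occur here?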
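To make the failure mode concrete: the metric reads `_keras_history`, a piece of metadata that Keras attaches to layer outputs, so any tensor reaching the metric without it raises the AttributeError shown above. The sketch below is only an illustration of that check. `keras_history_of`, `PlainTensor`, and `LayerOutput` are hypothetical stand-ins, not part of tf_crf_layer or TensorFlow.

```python
def keras_history_of(tensor):
    """Return (layer, node_index) if Keras metadata is attached, else None."""
    history = getattr(tensor, "_keras_history", None)
    if history is None:
        return None
    return tuple(history[:2])


class PlainTensor:
    """Stand-in for a tensor that did not come straight out of a Keras layer."""


class LayerOutput:
    """Stand-in for a tensor carrying (layer, node_index, tensor_index) metadata."""

    def __init__(self, layer):
        self._keras_history = (layer, 0, 0)


# A metric written like crf_accuracy works only in the second case.
print(keras_history_of(PlainTensor()))       # None
print(keras_history_of(LayerOutput("crf")))  # ('crf', 0)
```

A guard like this, placed at the pdb breakpoint, would show whether the `y_pred` handed to the metric is the CRF layer's own output or a derived tensor.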
Implementation 2
from tf_crf_layer.layer import CRF
from tf_crf_layer.loss import crf_loss, ConditionalRandomFieldLoss
from tf_crf_layer.metrics import crf_accuracy
from tf_crf_layer.metrics.sequence_span_accuracy import SequenceSpanAccuracy
model = Sequential()
model.add(Embedding(MAX_WORDS, EMBEDDING_LENGTH, input_length=MAX_SEQUENCE_LENGTH, mask_zero=True, weights=[embedding_matrix], trainable=False))
model.add(Bidirectional(LSTM(HIDDEN_SIZE, return_sequences=True)))
model.add(CRF(len(labels), name="crf_layer"))
model.summary()
crf_loss_instance = ConditionalRandomFieldLoss()
model.compile(loss={"crf_layer": crf_loss_instance}, optimizer='adam', metrics=[SequenceSpanAccuracy()])
Here the model compiles, but as soon as the first epoch of training starts, this error surfaces:
InvalidArgumentError: Expected begin and size arguments to be 1-D tensors of size 3, but got shapes [2] and [2] instead.
[[{{node loss_4/crf_layer_loss/Slice_1}}]]
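I am training the model with mini-batches; could that explain the error? I also noticed that although the CRF layer has the same number of parameters as above, its output shape in my model summary is missing one dimension (compare the CRF layer in the summary above with the one below). What causes this mismatch, and how can I fix it?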
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_5 (Embedding)      (None, 648, 300)          1500000
_________________________________________________________________
bidirectional_5 (Bidirection (None, 648, 1000)         3204000
_________________________________________________________________
crf_layer (CRF)              (None, 648)               6054
=================================================================
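The Slice error complains about rank-3 versus rank-2 arguments, and this summary shows the CRF layer emitting (None, 648) rather than (None, 648, 6). One reading consistent with both, offered as an assumption and not verified against tf_crf_layer, is that this layer outputs decoded tag indices and that its loss expects integer labels of shape (batch, 648) instead of one-hot labels of shape (batch, 648, 6). If that is the case, the labels would need converting along these lines. `one_hot_to_indices` and `rank` are hypothetical helpers, written in plain Python so the shapes are easy to follow.

```python
def one_hot_to_indices(batch):
    """Convert [batch][time][num_tags] one-hot labels to [batch][time] tag ids."""
    return [
        [max(range(len(step)), key=step.__getitem__) for step in sequence]
        for sequence in batch
    ]


def rank(nested):
    """Count nesting depth of a list-of-lists, i.e. the number of dimensions."""
    depth = 0
    while isinstance(nested, list):
        depth += 1
        nested = nested[0] if nested else None
    return depth


# Two sequences, two time steps each, three tags (toy sizes for illustration).
y_one_hot = [
    [[1, 0, 0], [0, 0, 1]],
    [[0, 1, 0], [1, 0, 0]],
]
y_indices = one_hot_to_indices(y_one_hot)

print(rank(y_one_hot), rank(y_indices))  # 3 2
print(y_indices)                         # [[0, 2], [1, 0]]
```

For labels stored as a NumPy array, `y.argmax(axis=-1)` performs the same conversion.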