Question

I have been trying to learn BERT multi-class labeling by working through the example given here: https://towardsdatascience.com/building-a-multi-label-text-classifier-using-bert-and-tensorflow-f188e0ecdc5d

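I was able to get all of the code working with the sigmoid function (multi-label). However, I want to switch to the softmax approach (multi-class). I uncommented the existing multi-class code and commented out the multi-label code. When the multiplication that computes the loss runs in the create_model() function, I get an error (pasted below). Please let me know what I am doing wrong and how I should handle the multi-class case.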

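I tried changing the line to this: per_example_loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

This runs, but I don't know whether it changes what is being computed. I want to make sure the model is working the right way.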
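For reference, here is a minimal pure-Python sketch of the arithmetic behind that question. It assumes each row of labels is a one-hot vector with exactly one 1; the helper names and the example logits are hypothetical and chosen only for illustration. Under that assumption, the manual formula from the MULTICLASS branch, -sum(one_hot * log_softmax(logits)), reduces to the negative log-probability of the true class. That is the quantity softmax cross-entropy is defined to compute when the labels form a valid probability distribution.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one example's logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def manual_loss(one_hot_label, logits):
    """-sum(one_hot * log_softmax(logits)), as in the MULTICLASS branch."""
    return -sum(y * lp for y, lp in zip(one_hot_label, log_softmax(logits)))


def loss_from_class_id(class_id, logits):
    """Negative log-probability of the true class, given an integer label."""
    return -log_softmax(logits)[class_id]


# Hypothetical example: 6 classes, true class is index 4.
logits = [2.0, -1.0, 0.5, 0.0, 1.5, -0.5]
class_id = 4
one_hot_label = [1.0 if j == class_id else 0.0 for j in range(len(logits))]

loss_a = manual_loss(one_hot_label, logits)
loss_b = loss_from_class_id(class_id, logits)
print(abs(loss_a - loss_b) < 1e-12)
```

If a row of labels contains more than one 1 (as multi-label data can), this equivalence no longer describes a single-class loss, so it is worth checking the label rows before relying on it.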

def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings):
    """Creates a classification model."""
    model = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids,
        use_one_hot_embeddings=use_one_hot_embeddings)

    # In the demo, we are doing a simple classification task on the entire
    # segment.
    #
    # If you want to use the token-level output, use model.get_sequence_output()
    # instead.
    output_layer = model.get_pooled_output()

    hidden_size = output_layer.shape[-1].value

    output_weights = tf.get_variable(
        "output_weights", [num_labels, hidden_size],
        initializer=tf.truncated_normal_initializer(stddev=0.02))

    output_bias = tf.get_variable(
        "output_bias", [num_labels], initializer=tf.zeros_initializer())

    with tf.variable_scope("loss"):
        if is_training:
            # I.e., 0.1 dropout
            output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

        logits = tf.matmul(output_layer, output_weights, transpose_b=True)
        logits = tf.nn.bias_add(logits, output_bias)

        # MULTILABEL
        # probabilities = tf.nn.softmax(logits, axis=-1) ### multiclass case
        # probabilities = tf.nn.sigmoid(logits)  #### multi-label case
        #
        # labels = tf.cast(labels, tf.float32)
        # tf.logging.info("num_labels:{};logits:{};labels:{}".format(num_labels, logits, labels))
        # per_example_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
        # loss = tf.reduce_mean(per_example_loss)

        # MULTICLASS STUFF
        probabilities = tf.nn.softmax(logits, axis=-1)
        log_probs = tf.nn.log_softmax(logits, axis=-1)

        one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

        per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
        loss = tf.reduce_mean(per_example_loss)

        return (loss, per_example_loss, logits, probabilities)

Error message

  File "path/Try.py", line 571, in model_fn
    num_labels, use_one_hot_embeddings)
  File "path/Try.py", line 539, in create_model
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py", line 1180, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6490, in mul
    "Mul", x=x, y=y, name=name)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 2027, in __init__
    control_input_ops)
  File "path/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1867, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 6 and 32 for 'loss/mul' (op: 'Mul') with input shapes: [32,6,6], [32,6].
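
The shapes in the message ([32,6,6] against [32,6]) would be consistent with labels already arriving as one-hot rows of shape [32, 6], as in the multi-label setup, so that one-hot encoding them again appends another axis of size 6. That reading is an assumption drawn from the shapes alone, not something the traceback states. A minimal pure-Python sketch of the shape behaviour follows; the `one_hot` and `shape` helpers are hypothetical stand-ins that mimic how a one-hot op appends a trailing axis, not TensorFlow code.

```python
def one_hot(indices, depth):
    """Mimic a one-hot op on nested lists: append a trailing axis of size depth."""
    if isinstance(indices, list):
        return [one_hot(i, depth) for i in indices]
    return [1.0 if j == indices else 0.0 for j in range(depth)]


def shape(x):
    """Return the dimensions of a rectangular nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return dims


batch_size, num_labels = 32, 6

# Case 1: integer class ids, shape [32]. One-hot encoding gives [32, 6].
class_ids = [i % num_labels for i in range(batch_size)]
print(shape(one_hot(class_ids, num_labels)))

# Case 2: labels that are already one-hot rows, shape [32, 6].
# Encoding them again gives [32, 6, 6], which cannot be multiplied
# element-wise with log-probabilities of shape [32, 6].
already_one_hot = [[1 if j == c else 0 for j in range(num_labels)]
                   for c in class_ids]
print(shape(one_hot(already_one_hot, num_labels)))
```

Under that assumption, either the input pipeline would need to supply integer class ids of shape [32], or the second one-hot encoding would need to be skipped.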
