tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3] vs. [8]?

Date: 2019-07-29 14:48:53

Tags: tensorflow nlp

I am using BERT for binary classification with a batch size of 8, but when I compute the loss I always get the following error:


Traceback (most recent call last):
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3] vs. [8]
  [[{{node gradients/sub_grad/BroadcastGradientArgs}}]]


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "E:/project_chris/aad_bert_version/run.py", line 81, in <module>
    input_y: y_train})
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3] vs. [8]
  [[node gradients/sub_grad/BroadcastGradientArgs (defined at E:/project_chris/aad_bert_version/run.py:57) ]]


Original stack trace for 'gradients/sub_grad/BroadcastGradientArgs':
  File "E:/project_chris/aad_bert_version/run.py", line 57, in <module>
    train_op = tf.train.AdamOptimizer(lr).minimize(loss)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\training\optimizer.py", line 403, in minimize
    grad_loss=grad_loss)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\training\optimizer.py", line 512, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 158, in gradients
    unconnected_gradients)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 731, in _GradientsHelper
    lambda: grad_fn(op, *out_grads))
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 403, in _MaybeCompile
    return grad_fn()  # Exit early
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 731, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\math_grad.py", line 1027, in _SubGrad
    rx, ry = gen_array_ops.broadcast_gradient_args(sx, sy)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1004, in broadcast_gradient_args
    "BroadcastGradientArgs", s0=s0, s1=s1, name=name)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()


...which was originally created as op 'sub', defined at:
  File "E:/project_chris/aad_bert_version/run.py", line 56, in <module>
    loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(input_y, [-1])))
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 11574, in sub
    "Sub", x=x, y=y, name=name)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\Meiwei\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()
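In other words, the failing node is the element-wise subtraction inside the loss: at run time one operand flattens to length 3 and the other to length 8, and neither broadcasts to the other. A minimal sketch that reproduces the same runtime error with bare placeholders (assuming TensorFlow 1.x, as in the question; the names a and b are illustrative only):

import numpy as np
import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[None], name='a')
b = tf.placeholder(tf.float32, shape=[None], name='b')
diff = a - b  # builds the same Sub op that pred - input_y builds in the loss

with tf.Session() as sess:
    # Feeding 3 values against 8 raises
    # InvalidArgumentError: Incompatible shapes: [3] vs. [8]
    sess.run(diff, feed_dict={a: np.zeros(3), b: np.zeros(8)})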

import os

import numpy as np
import tensorflow as tf

import modeling       # BERT model code (from the BERT codebase this script builds on)
import tokenization   # tokenizer module providing CharTokenizer (same codebase)

lr = 0.0006  # learning rate

# model and vocabulary files
data_root = './bert_model_chinese'
bert_config_file = os.path.join(data_root, 'bert_config.json')
bert_config = modeling.BertConfig.from_json_file(bert_config_file)
init_checkpoint = os.path.join(data_root, 'bert_model.ckpt')
bert_vocab_file = os.path.join(data_root, 'vocab.txt')
token = tokenization.CharTokenizer(vocab_file=bert_vocab_file)

input_ids = tf.placeholder(tf.int32, shape=[None, None], name='input_ids')
input_mask = tf.placeholder(tf.int32, shape=[None, None], name='input_masks')
segment_ids = tf.placeholder(tf.int32, shape=[None, None], name='segment_ids')
input_y = tf.placeholder(tf.float32, shape=[None, 1], name="input_y")
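# One-unit output head: projects the 768-dim pooled BERT vector to a single
# score per example, matching input_y's [batch, 1] shape.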
weights = {
    'out': tf.Variable(tf.random_normal([768, 1]))
}
biases = {
    'out': tf.Variable(tf.constant(0.1, shape=[1, ]))
}

model = modeling.BertModel(
    config=bert_config,
    is_training=False,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids,
    use_one_hot_embeddings=False)

tvars = tf.trainable_variables()
(assignment, initialized_variable_names) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
tf.train.init_from_checkpoint(init_checkpoint, assignment)
output_layer_pooled = model.get_pooled_output()  # pooled sentence-level output
output_layer_pooled = tf.nn.dropout(output_layer_pooled, keep_prob=0.9)

w_out = weights['out']
b_out = biases['out']
pred = tf.add(tf.matmul(output_layer_pooled, w_out), b_out, name="pre1")
pred = tf.reshape(pred, shape=[-1, 1], name="pre")

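# The traceback points at the next two lines: the Sub op built by
# pred - input_y subtracts the flattened predictions from the flattened
# labels, so both must have the same length (the batch size) at run time.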
loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(input_y, [-1])))
train_op = tf.train.AdamOptimizer(lr).minimize(loss)



EPOCHS = 5
max_sentence_length = 512
batch_size = 8
data_path = './data'
train_input, predict_input = fffffuck(data_path, bert_vocab_file, True, True,
                                      './temp', max_sentence_length, batch_size,
                                      batch_size, batch_size)
data_loader = TextLoader(train_input, batch_size)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(EPOCHS):
        data_loader.shuff()
        for j in range(data_loader.num_batches):
            x_train, y_train = data_loader.next_batch(j)
            print(y_train)
            print(y_train.shape)
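            # NOTE: next_batch() stacks each sample's (ids, mask, segments)
            # triple, so x_train has shape [batch_size, 3, seq_len];
            # x_train[0] is the first sample (3 rows), whereas x_train[:, 0]
            # would be the input_ids of all batch_size samples.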
            x_input_ids = x_train[0]
            x_input_mask = x_train[1]
            x_segment_ids = x_train[2]
            loss_, _ = sess.run([loss, train_op],
                                feed_dict={input_ids: x_input_ids, input_mask: x_input_mask, segment_ids: x_segment_ids,
                                           input_y: y_train})
            print('loss:', loss_)


class TextLoader(object):
    def __init__(self, dataSet, batch_size):
        self.data = dataSet
        self.batch_size = batch_size
        self.shuff()

    def shuff(self):
        self.num_batches = int(len(self.data) // self.batch_size)
        if self.num_batches == 0:
            assert False, 'Not enough data, make batch_size small.'
        np.random.shuffle(self.data)

    def next_batch(self, k):
        x = []
        y = []
        for i in range(self.batch_size):
            tmp = list(self.data)[k * self.batch_size + i][:3]
            x.append(tmp)
            y_ = list(self.data)[k * self.batch_size + i][3]
            y.append(y_)
        x = np.array(x)
        return x, np.array(y).reshape([self.batch_size, 1])
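For what it's worth, here is a standalone sketch of the batch shapes next_batch() produces, assuming each dataset row is an (input_ids, input_mask, segment_ids, label) tuple of fixed-length sequences (the dummy data is illustrative only):

import numpy as np

batch_size, seq_len = 8, 512
# mimic next_batch(): each sample contributes a 3-row (ids, mask, segments) block
x = np.array([[np.zeros(seq_len, dtype=np.int32)] * 3 for _ in range(batch_size)])

print(x.shape)        # (8, 3, 512)
print(x[0].shape)     # (3, 512)  -- one sample's three feature rows
print(x[:, 0].shape)  # (8, 512)  -- the input_ids of the whole batch

If x_train[0] (three rows) is fed as input_ids, BERT sees a batch of 3 while input_y still carries 8 labels, which would produce exactly the reported Incompatible shapes: [3] vs. [8].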

0 Answers

No answers yet.