keras, Incompatible shapes: [64] vs. [64,280]

Time: 2018-11-10 06:28:44

Tags: keras nlp embedding

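I am building a simple binary classifier, but something goes wrong and I don't understand why. Note: I have already overridden the relevant layers so that they support mask_zero. Details follow. My model is: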

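# Imports assumed for this snippet (Keras 2.x); FlattenWithMasking is the asker's own custom layer.
from keras.layers import Input, Embedding, Dropout, Dense
from keras.models import Model

def mymodel_v1(conf_dt):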
    input_ = Input(shape=(conf_dt['sent_len'],), dtype='int32')
    embedding = Embedding(input_dim=conf_dt['vocab_size']+1, output_dim=conf_dt['embed_size'], input_length=conf_dt['sent_len'], mask_zero=True)(input_)
    flat = FlattenWithMasking()(embedding)
    dropout = Dropout(rate=conf_dt['dropout'])(flat)
    dense = Dense(300)(dropout)
    out = Dense(1, activation='sigmoid')(dense)

    model = Model(inputs=input_, outputs=out)
    print model.summary()
    return model

The training step is:

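# Imports assumed for this snippet; load_data is the asker's own helper and is not shown.
import sys

import numpy as np
from keras.preprocessing.sequence import pad_sequences

def train(train_fp, test_fp, conf_dt):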
    train_X, train_y, test_X, test_y, _ = load_data(train_fp, test_fp)
    train_X = pad_sequences(train_X, maxlen=conf_dt['sent_len'], padding='post', truncating='post')
    test_X = pad_sequences(test_X, maxlen=conf_dt['sent_len'], padding='post', truncating='post')
    train_y = np.array(train_y, ndmin=2)
    test_y = np.array(test_y, ndmin=2)
    print 'data load and preprocess done'
    print 'train_X.shape: ', train_X.shape
    print 'train_y.shape: ', train_y.shape
    sys.stdout.flush()

    model = mymodel_v1(conf_dt)
    model.compile(optimizer='rmsprop', loss='binary_crossentropy')

    model.fit(train_X, train_y, batch_size=64, nb_epoch=2, verbose=2)
    print model.summary()

    model.evaluate(test_X, test_y)

The config dict is:

conf_dt = {'vocab_size': 200000, 'dropout': 0.3, 'sent_len': 280, 'embed_size': 50}

The printed shapes and the output of model.summary are:

train_X.shape:  (116389, 280)
train_y.shape:  (116389, 1)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 280)               0         
_________________________________________________________________
embedding_1 (Embedding)      (None, 280, 50)           10000050  
_________________________________________________________________
flatten_with_masking_1 (Flat (None, 14000)             0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 14000)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 300)               4200300   
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 301       
=================================================================
Total params: 14,200,651
Trainable params: 14,200,651
Non-trainable params: 0
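
As a sanity check, the parameter counts in the summary can be reproduced from conf_dt alone. The sketch below is plain arithmetic, not Keras code. The variable names are made up for illustration, and the 300 and 1 are the unit counts of the two Dense layers in mymodel_v1.

```python
conf_dt = {'vocab_size': 200000, 'dropout': 0.3, 'sent_len': 280, 'embed_size': 50}

# Embedding: one row of embed_size weights per index, with input_dim = vocab_size + 1.
embedding_params = (conf_dt['vocab_size'] + 1) * conf_dt['embed_size']

# Flattening (sent_len, embed_size) gives sent_len * embed_size features per sample.
flat_units = conf_dt['sent_len'] * conf_dt['embed_size']

# Dense layers: inputs * units weights plus one bias per unit.
dense_1_params = flat_units * 300 + 300
dense_2_params = 300 * 1 + 1

total_params = embedding_params + dense_1_params + dense_2_params
print(total_params)
```

The numbers agree with the summary, so the layer shapes themselves are consistent. The 280 in the error message equals sent_len, not any layer's unit count.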

The traceback is as follows:

      File "train.py", line 31, in train
    model.fit(train_X, train_y, batch_size=64, nb_epoch=2, verbose=2)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 1598, in fit
    validation_steps=validation_steps)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 1183, in _fit_loop
    outs = f(ins_batch)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2273, in __call__
    **self.session_kwargs)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [64] vs. [64,280]
         [[Node: loss/dense_2_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](loss/dense_2_loss/Mean, loss/dense_2_loss/Cast)]]

Caused by op u'loss/dense_2_loss/mul', defined at:
  File "train.py", line 49, in <module>
    train(train_pt, test_pt, conf_dt)
  File "train.py", line 29, in train
    model.compile(optimizer='rmsprop', loss='binary_crossentropy')
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 850, in compile
    sample_weight, mask)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 455, in weighted
    score_array *= mask
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 865, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1088, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1449, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [64] vs. [64,280]
         [[Node: loss/dense_2_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](loss/dense_2_loss/Mean, loss/dense_2_loss/Cast)]]

Why does this happen? Can anyone help me? Thanks.

2 answers:

Answer 0 (score: 0)

I read some of the Keras source code: by default, the Dense and Dropout layers support masking. The layer I overrode is FlattenWithMasking.
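
One way to read the traceback, sketched under assumptions: the failing op is `score_array *= mask`, where loss/dense_2_loss/Mean has shape [64] and loss/dense_2_loss/Cast has shape [64,280]. The 280 equals sent_len, which suggests that the per-timestep mask from the Embedding layer reached the loss without being reduced. The helper below, broadcast_shape, is a hypothetical name and not a Keras or TensorFlow function. It applies the usual right-aligned broadcasting rule in pure Python to show why these two shapes cannot be multiplied elementwise.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError.

    Shapes are aligned from the right; two dimensions are compatible
    when they are equal or when one of them is 1.
    """
    rank = max(len(a), len(b))
    # Pad the shorter shape with leading 1s so both have the same rank.
    padded_a = (1,) * (rank - len(a)) + tuple(a)
    padded_b = (1,) * (rank - len(b)) + tuple(b)
    result = []
    for x, y in zip(padded_a, padded_b):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(
                "Incompatible shapes: %s vs. %s" % (list(a), list(b)))
    return tuple(result)


loss_shape = (64,)       # per-sample loss (loss/dense_2_loss/Mean)
mask_shape = (64, 280)   # mask cast to float (loss/dense_2_loss/Cast)

try:
    broadcast_shape(loss_shape, mask_shape)
except ValueError as err:
    print(err)

# A mask with one entry per sample would be compatible with the loss.
print(broadcast_shape(loss_shape, (64,)))
```

If this reading is right, the mask has to be dropped or reduced before the output layer. Two possible ways, both assumptions here, are to have the custom flatten layer return None from compute_mask, or to set mask_zero=False on the Embedding layer.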

Answer 1 (score: 0)

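First, why does your dense_2 layer have an output shape of only 1? That seems strange to me: the model-level output carries a confidence, so the output should be at least 2.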

A shape of [64] does not fit [280,64]; you need at least something like [1,64] to be able to use:

 tf.convert_to_tensor()

(You should multiply a matrix by a matrix, not a variable by a matrix):