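I am building a simple binary classifier, but something goes wrong, and I don't understand why. Note: I have already overridden the relevant layers so that they support mask_zero. Details follow. My model is: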
def mymodel_v1(conf_dt):
    input_ = Input(shape=(conf_dt['sent_len'],), dtype='int32')
    embedding = Embedding(input_dim=conf_dt['vocab_size']+1, output_dim=conf_dt['embed_size'], input_length=conf_dt['sent_len'], mask_zero=True)(input_)
    flat = FlattenWithMasking()(embedding)
    dropout = Dropout(rate=conf_dt['dropout'])(flat)
    dense = Dense(300)(dropout)
    out = Dense(1, activation='sigmoid')(dense)
    model = Model(inputs=input_, outputs=out)
    print model.summary()
    return model
And the training step is:
def train(train_fp, test_fp, conf_dt):
    train_X, train_y, test_X, test_y, _ = load_data(train_fp, test_fp)
    train_X = pad_sequences(train_X, maxlen=conf_dt['sent_len'], padding='post', truncating='post')
    test_X = pad_sequences(test_X, maxlen=conf_dt['sent_len'], padding='post', truncating='post')
    train_y = np.array(train_y, ndmin=2)
    test_y = np.array(test_y, ndmin=2)
    print 'data load and preprocess done'
    print 'train_X.shape: ', train_X.shape
    print 'train_y.shape: ', train_y.shape
    sys.stdout.flush()
    model = mymodel_v1(conf_dt)
    model.compile(optimizer='rmsprop', loss='binary_crossentropy')
    model.fit(train_X, train_y, batch_size=64, nb_epoch=2, verbose=2)
    print model.summary()
    model.evaluate(test_X, test_y)
The config dict is:
conf_dt = {'vocab_size': 200000, 'dropout': 0.3, 'sent_len': 280, 'embed_size': 50}
The printed shapes and the model.summary output are:
train_X.shape: (116389, 280)
train_y.shape: (116389, 1)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 280)               0
_________________________________________________________________
embedding_1 (Embedding)      (None, 280, 50)           10000050
_________________________________________________________________
flatten_with_masking_1 (Flat (None, 14000)             0
_________________________________________________________________
dropout_1 (Dropout)          (None, 14000)             0
_________________________________________________________________
dense_1 (Dense)              (None, 300)               4200300
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 301
=================================================================
Total params: 14,200,651
Trainable params: 14,200,651
Non-trainable params: 0
The traceback is as follows:
  File "train.py", line 31, in train
    model.fit(train_X, train_y, batch_size=64, nb_epoch=2, verbose=2)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 1598, in fit
    validation_steps=validation_steps)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 1183, in _fit_loop
    outs = f(ins_batch)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2273, in __call__
    **self.session_kwargs)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [64] vs. [64,280]
    [[Node: loss/dense_2_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](loss/dense_2_loss/Mean, loss/dense_2_loss/Cast)]]
Caused by op u'loss/dense_2_loss/mul', defined at:
  File "train.py", line 49, in <module>
    train(train_pt, test_pt, conf_dt)
  File "train.py", line 29, in train
    model.compile(optimizer='rmsprop', loss='binary_crossentropy')
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 850, in compile
    sample_weight, mask)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/keras/engine/training.py", line 455, in weighted
    score_array *= mask
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 865, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1088, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1449, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/homework/.jumbo/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Incompatible shapes: [64] vs. [64,280]
    [[Node: loss/dense_2_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](loss/dense_2_loss/Mean, loss/dense_2_loss/Cast)]]
Why does this happen? Can anyone help me? Thanks.
Answer 0 (score: 0)
I read some of the Keras source code. By default, the Dense and Dropout layers support masking. Below is the FlattenWithMasking layer that I overrode:
Answer 1 (score: 0)
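First, why does your dense_2 layer output a shape of only 1? That sounds odd to me: since the model outputs a confidence level, I would expect the output size to be at least 2.
A shape of 64 does not fit with [280, 64]; you need at least something like [1, 64] to be able to use:
tf.convert_to_tensor()
(you should multiply a matrix by a matrix, not a variable by a matrix):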
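As a side note on the two shapes in the error message: the failing op is the elementwise `score_array *= mask` in Keras's training.py, and TensorFlow's elementwise multiply follows NumPy-style broadcasting. Below is a minimal pure-Python sketch of that rule. The helper name `broadcast_shape` is hypothetical and is not part of Keras or TensorFlow. Judging from the traceback, the `[64]` operand is the per-sample loss and the `[64, 280]` operand is probably the padding mask from the Embedding layer reaching the loss unflattened, but that reading is an assumption.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or None if they are incompatible.

    Hypothetical helper mirroring NumPy-style broadcasting; not a Keras/TF API.
    """
    # Pad the shorter shape with leading 1s so both have the same rank.
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)

    result = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            result.append(x)   # equal sizes, or b is stretched along this axis
        elif x == 1:
            result.append(y)   # a is stretched along this axis
        else:
            return None        # sizes differ and neither is 1
    return tuple(result)


# The pair from the traceback: per-sample loss vs. a per-timestep mask.
print(broadcast_shape((64,), (64, 280)))    # None
# A loss with a trailing axis of size 1 would broadcast against the mask.
print(broadcast_shape((64, 1), (64, 280)))  # (64, 280)
# A mask already reduced to one value per sample matches the loss.
print(broadcast_shape((64,), (64,)))        # (64,)
```

Under this rule `[64]` is aligned with the last axis of `[64, 280]`, so 64 is compared with 280 and the multiply fails. If the assumption about the mask holds, the mask would have to be dropped or reduced to one value per sample before it reaches the loss.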