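Before getting into the specifics, here is some background.
I am building a machine learning model (encoder-decoder-like) that uses a GRU on the encoder side to take in sentences, and on the decoder side takes in images that have been passed through a deep CNN stack (e.g. Inception). I am training the model on the COCO dataset, with the target given as a vector of zeros of shape (batch_size, 2048), so that it learns the similarity between the two modalities, captions and images.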
Model definition
The model on the caption side (encoder)
from tensorflow.keras.layers import Input, Embedding, Masking, GRU, Lambda, Flatten, Dense, concatenate
from tensorflow.keras.models import Model

print("loading the caption side")
cap_input = Input(shape=(model_config['max_cap_length'],), dtype='float32', name='cap_input')
# Embed the token ids first; Masking takes the embedded tensor X as its input.
X = Embedding(output_dim=model_config['dim_word'], input_dim=len(tokenizer.word_index), input_length=model_config['max_cap_length'])(cap_input)
X = Masking(mask_value=0)(X)
# GRU returns only its final state, of shape (batch_size, output_dim).
X = GRU(model_config['output_dim'])(X)
emb_cap = Lambda(lambda x: l2norm(x))(X)
The model on the image side (decoder)
print("loading the image side")
image_input = Input(shape=(64, 2048), name='image_input')
X = Flatten()(image_input)
X = Dense(model_config['output_dim'])(X)
emb_image = Lambda(lambda x: l2norm(x))(X)
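The `l2norm` helper used in both branches is not shown in the post. Assuming it scales each embedding row to unit Euclidean length (inside Keras this would more likely be a backend call such as `K.l2_normalize(x, axis=-1)`), a minimal plain-Python sketch of that computation could look like the following. The name `l2norm_rows` and the `eps` guard are hypothetical and only illustrate the arithmetic.

```python
import math


def l2norm_rows(rows, eps=1e-12):
    """Scale each row (a list of floats) to unit Euclidean length.

    Hypothetical stand-in for the post's l2norm helper, which is not shown.
    """
    normalized = []
    for row in rows:
        # Euclidean norm of the row.
        norm = math.sqrt(sum(v * v for v in row))
        # eps avoids division by zero for an all-zero row.
        normalized.append([v / max(norm, eps) for v in row])
    return normalized


unit_rows = l2norm_rows([[3.0, 4.0], [0.0, 0.0]])
```

With both embeddings normalized this way, their dot product is the cosine similarity, which is what a contrastive loss on caption/image pairs typically relies on.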
The merge/concatenate layer
print("loading the joined model")
merged = concatenate([emb_cap, emb_image])
Defining and compiling the model
model = Model(inputs=[cap_input, image_input], outputs=[merged])
model.compile(optimizer=model_config['optimizer'], loss=contrastive_loss)
The model summary is as follows:
Now, when I try to fit the model on a batch of data, I get this error.
Model fitting code
model.fit(a,b,epochs = 1)
where:
a = [(batchsize,49),(batchsize,64,2048)]
b = (batchsize,2048)
a[0] => The caption-side where 49 is the max length to which all other captions in the dataset are padded.
a[1] => The image side where each image is preprocessed to be of shape (64,2048)
b => The output where 2048 corresponds to the "merged" layer where I am concatenating two tensors of dim (batchsize,1024)
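As a sanity check on the shapes listed above, here is a minimal plain-Python sketch that follows them through both branches without running Keras. It assumes `output_dim` is 1024 (from the description of the merged layer); the default `dim_word=300` is an illustrative assumption, since the post does not give the embedding size, and `trace_shapes` is a hypothetical helper.

```python
def trace_shapes(batch_size, max_cap_length=49, dim_word=300, output_dim=1024,
                 image_regions=64, image_features=2048):
    """Return the expected tensor shape at each stage of the two branches.

    dim_word=300 is an assumed value; the other defaults come from the post.
    """
    # Caption branch: token ids -> Embedding -> GRU (final state only).
    cap_input = (batch_size, max_cap_length)
    embedded = (batch_size, max_cap_length, dim_word)
    emb_cap = (batch_size, output_dim)

    # Image branch: CNN features -> Flatten -> Dense.
    image_input = (batch_size, image_regions, image_features)
    flattened = (batch_size, image_regions * image_features)
    emb_image = (batch_size, output_dim)

    # concatenate joins the two embeddings along the last axis.
    merged = (batch_size, emb_cap[1] + emb_image[1])

    return {
        "cap_input": cap_input,
        "embedded": embedded,
        "emb_cap": emb_cap,
        "image_input": image_input,
        "flattened": flattened,
        "emb_image": emb_image,
        "merged": merged,
    }


shapes = trace_shapes(batch_size=32)
```

Under these assumptions the merged output is (batch_size, 2048), which matches the shape of `b`, so the shapes themselves appear consistent with the model definition.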
Here is the full stack trace of the error
InvalidArgumentError Traceback (most recent call last)
<ipython-input-147-e50fb5e10e5c> in <module>()
----> 1 model.fit(a,b,epochs = 1)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, max_queue_size, workers, use_multiprocessing, **kwargs)
878 initial_epoch=initial_epoch,
879 steps_per_epoch=steps_per_epoch,
--> 880 validation_steps=validation_steps)
881
882 def evaluate(self,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py in model_iteration(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps, mode, validation_in_fit, **kwargs)
308 if ins and isinstance(ins[-1], int):
309 # Do not slice the training phase flag.
--> 310 ins_batch = slice_arrays(ins[:-1], batch_ids) + [ins[-1]]
311 else:
312 ins_batch = slice_arrays(ins, batch_ids)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in slice_arrays(arrays, start, stop)
524 if hasattr(start, 'shape'):
525 start = start.tolist()
--> 526 return [None if x is None else x[start] for x in arrays]
527 else:
528 return [None if x is None else x[start:stop] for x in arrays]
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in <listcomp>(.0)
524 if hasattr(start, 'shape'):
525 start = start.tolist()
--> 526 return [None if x is None else x[start] for x in arrays]
527 else:
528 return [None if x is None else x[start:stop] for x in arrays]
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py in _slice_helper(tensor, slice_spec, var)
652 ellipsis_mask=ellipsis_mask,
653 var=var,
--> 654 name=name)
655
656
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py in strided_slice(input_, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, var, name)
818 ellipsis_mask=ellipsis_mask,
819 new_axis_mask=new_axis_mask,
--> 820 shrink_axis_mask=shrink_axis_mask)
821
822 parent_name = name
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py in strided_slice(input, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, name)
9332 else:
9333 message = e.message
-> 9334 _six.raise_from(_core._status_to_exception(e.code, message), None)
9335 # Add nodes to the TensorFlow graph.
9336 if begin_mask is None:
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)
InvalidArgumentError: Index out of range using input dim 2; input has only 2 dims [Op:StridedSlice] name: strided_slice/
Any ideas as to why this is happening? Could someone also explain what the error means?