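I am training a TensorFlow model with an LSTM for predictive maintenance. For each instance I build a (50, 4) matrix, where 50 is the length of the history sequence and 4 is the number of features per record, so to train the model I use, for example, a (55048, 50, 4) tensor as input and a (55048, 1) tensor as labels. When I train on my own computer with Jupyter it runs (very slowly, but it runs), but on Colab I get this error: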
Exception has occured: ConnectionRefusedError [Errno 111] Connection refused
I am sharing some of the output below. I know it is long:
Training data shape is (55048, 50, 4)
Labels shape is (55048, 1)
WARNING:tensorflow:Layer lstm will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 50, 100) 42000
_________________________________________________________________
dense (Dense) (None, 50, 1) 101
=================================================================
Total params: 42,101
Trainable params: 42,101
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
ValueError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function *
outputs = self.distribute_strategy.run(
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run **
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:543 train_step **
self.compiled_metrics.update_state(y, y_pred, sample_weight)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:406 update_state
metric_obj.update_state(y_t, y_p)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/metrics_utils.py:90 decorated
update_op = update_state_fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/metrics.py:2083 update_state
label_weights=label_weights)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/metrics_utils.py:351 update_confusion_matrix_variables
y_pred.shape.assert_is_compatible_with(y_true.shape)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1117 assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (None, 50) and (None, 1) are incompatible
Why does it work in Jupyter but not in Colab? Thank you for your attention.
Answer 0 (score: 0)
In my case, I uninstalled tensorflow and then installed tensorflow-gpu, and that solved the problem.
Answer 1 (score: 0)
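I had already set the runtime to GPU. If, instead of a one-node Dense layer as the last layer (for binary classification), I use a one-node LSTM layer as the last layer, it works. Perhaps that is because LSTM and Dense layers should not be mixed. Thanks for your reply.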
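For anyone reading the traceback later: the final ValueError looks like a shape mismatch, not a problem with combining layer types. The model summary shows the LSTM emitting (None, 50, 100), which is what an LSTM produces when it returns the full sequence, and a Dense layer only transforms the last axis, so the output stays per-timestep. The sketch below reproduces that arithmetic in plain Python. The helper names are hypothetical and only mimic the shape rules; this is not TensorFlow code, and it assumes the standard LSTM parameter count of four gates with input weights, recurrent weights and a bias.

```python
from typing import Optional, Tuple

# A shape with an unknown batch dimension, e.g. (None, 50, 4).
Shape = Tuple[Optional[int], ...]


def lstm_output_shape(input_shape: Shape, units: int, return_sequences: bool) -> Shape:
    """Hypothetical helper: output shape of an LSTM layer."""
    batch, timesteps, _features = input_shape
    if return_sequences:
        # One hidden state per timestep.
        return (batch, timesteps, units)
    # Only the hidden state of the last timestep.
    return (batch, units)


def lstm_param_count(features: int, units: int) -> int:
    """Hypothetical helper: 4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (features + units) + units)


def dense_output_shape(input_shape: Shape, units: int) -> Shape:
    """Hypothetical helper: a dense layer only replaces the last axis."""
    return input_shape[:-1] + (units,)


input_shape: Shape = (None, 50, 4)

# Case 1: the LSTM returns the whole sequence (matches the summary in the question).
per_step = dense_output_shape(lstm_output_shape(input_shape, 100, True), 1)

# Case 2: the LSTM returns only its last state (matches labels of shape (None, 1)).
last_only = dense_output_shape(lstm_output_shape(input_shape, 100, False), 1)

print("LSTM params:", lstm_param_count(features=4, units=100))
print("return_sequences=True  ->", per_step)
print("return_sequences=False ->", last_only)
```

The parameter count agrees with the 42000 shown in the summary, and the first case gives the (None, 50, 1) output that cannot be compared against (None, 1) labels. In Keras the second case corresponds to building the layer with `return_sequences=False` (the default), after which a one-node Dense layer on top yields (None, 1). This would also be consistent with the one-unit LSTM working as a last layer, although the model code is not shown in the question, so that part is an inference.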