I am currently putting together my first RNN + LSTM model in TensorFlow.
I want to try training it on some "real-time" sensor data, but from the examples I've found online it isn't clear to me how to solve the problem that my training updates are very slow.
What I think is happening is that on every iteration of the loop, TensorFlow opens/creates a new TF device (in this case my two GPUs), then closes them after processing a single sample, then reopens them again... This adds roughly 3 seconds to every training cycle, regardless of batch size.
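One way to check that hypothesis before digging into TensorFlow internals is to time each training call across several batch sizes: if the per-call time stays roughly constant instead of growing with the batch, the cost is dominated by fixed setup overhead. Below is a minimal, self-contained timing sketch; `fake_partial_fit` is a hypothetical stand-in for `regressor.partial_fit` that simulates the suspected fixed-setup-plus-per-sample behaviour, not real TensorFlow code.

```python
import time

def time_call(fn, *args):
    """Time a single call; used to check whether each training call
    carries a fixed setup cost regardless of batch size."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

def fake_partial_fit(batch_size, setup=0.05, per_sample=0.001):
    """Hypothetical stand-in for regressor.partial_fit: a fixed
    'device setup' cost plus a small per-sample cost."""
    time.sleep(setup + per_sample * batch_size)

for bs in (1, 8, 64):
    # If the printed times are all close to the setup cost, the
    # overhead is fixed per call rather than proportional to the batch.
    t = time_call(fake_partial_fit, bs)
    print(f"batch_size={bs:3d}  seconds={t:.3f}")
```

The same `time_call` wrapper can be placed around the real `partial_fit` call in the loop to measure where the ~3 seconds actually go.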
Here is some code showing how I configure the estimator and how I call it in a loop:
from tensorflow.contrib import learn

regressor = learn.TensorFlowEstimator(
    model_fn=lstm_model(TIMESTEPS, RNN_LAYERS, DENSE_LAYERS),
    n_classes=0,
    verbose=0,
    batch_size=1,
    steps=1,
    optimizer='Adagrad',
    learning_rate=0.03,
    continue_training=True,
)
for x in range(10):  # loop while sensor data is streaming in
    print(x)
    X, y = get_sensors_data()  # get a single sample of sensor data
    regressor.partial_fit(
        X['train'][x].reshape((1, 10, 1)),
        y['train'][x].reshape((1, 1)),
        steps=1,
    )
    # predict_next_position()
When the code above runs, I get verbose output like the following (even though verbose=0):
0
INFO:tensorflow:Create CheckpointSaverHook
INFO:tensorflow:Create CheckpointSaverHook
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7845
pciBusID 0000:03:00.0
Total memory: 8.00GiB
Free memory: 7.10GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x5c2f230
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7845
pciBusID 0000:01:00.0
Total memory: 8.00GiB
Free memory: 7.02GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
INFO:tensorflow:loss = 0.285131, step = 1
INFO:tensorflow:loss = 0.285131, step = 1
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmptouwbf48/model.ckpt.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmptouwbf48/model.ckpt.
1
And Repeat...
I'm not sure whether my approach is a valid way to train in this configuration, as I'm quite new to this field. Any suggestions would be welcome!