I am using TensorFlow 1.9 to train a simple neural network on some toy data, to try to understand how TensorFlow allocates GPU memory. My graphics card is an NVIDIA GeForce GTX 780 Ti, which has 3 GB of GPU memory.
In my code, I create the data and set the batch size such that a single batch occupies 4 GB of memory. This can be verified in the code by printing the number of bytes of the NumPy arrays holding this data.
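For reference, the 4 GB figure follows directly from the array shapes and the float32 dtype; a quick back-of-the-envelope check (mirroring the nbytes calculation in the full code below) looks like this:
# Rough memory footprint of one batch: float32 = 4 bytes per value.
batch_size = 1000000
input_size = 1000
batch_bytes = batch_size * (input_size + 1) * 4  # inputs plus one label column
print(batch_bytes / 1e9)  # ~4.0 GB, in line with the 4000000000 in the warning below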
When I run this code, I get the following warning message, which is printed 3 times per batch:
2018-08-03 14:24:50.021264: W tensorflow/core/framework/allocator.cc:108] Allocation of 4000000000 exceeds 10% of system memory.
I have two questions about this:
1) What does this warning message mean? 10% of which memory? GPU memory?
2) How does a 4 GB batch fit onto a GPU that only has 3 GB of memory? Is the batch split into sub-batches, with each sub-batch sent through the GPU independently?
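To make question 2 concrete, by "sub-batches" I mean something like the following hypothetical splitting (this is not in my actual code; num_sub_batches and the slicing here are made up purely for illustration):
# Hypothetical manual sub-batching, only to illustrate question 2.
num_sub_batches = 4
sub_batch_size = batch_size // num_sub_batches
for sub_batch_num in range(num_sub_batches):
    sub_inputs = batch_inputs[sub_batch_num * sub_batch_size: (sub_batch_num + 1) * sub_batch_size]
    sub_labels = batch_labels[sub_batch_num * sub_batch_size: (sub_batch_num + 1) * sub_batch_size]
    sub_loss, _ = sess.run([loss_op, train_op], feed_dict={input_placeholder: sub_inputs, label_placeholder: sub_labels})
Is TensorFlow effectively doing something like this behind the scenes, or is the whole 4 GB batch placed on the GPU at once?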
In case it is of interest, my full code is below:
# Python imports
import numpy as np
# Tensorflow imports
import tensorflow as tf
# Set some parameters
np.random.seed(0)
num_examples = 2000000
input_size = 1000
num_training_examples = int(0.8 * num_examples)
num_validation_examples = int(0.2 * num_examples)
batch_size = 1000000
# Create the data
print('Creating data')
input_data = np.random.rand(num_examples, input_size).astype(np.float32)
label_data = np.random.rand(num_examples, 1).astype(np.float32)
training_input_data = input_data[:num_training_examples]
training_label_data = label_data[:num_training_examples]
validation_input_data = input_data[num_training_examples:]
validation_label_data = label_data[num_training_examples:]
print('Data created')
# Get the memory for the data
data_memory = training_input_data.nbytes + training_label_data.nbytes + validation_input_data.nbytes + validation_label_data.nbytes
data_memory /= 1e6
print('Dataset memory = ' + str(data_memory) + ' MB')
example_memory = training_input_data[0].nbytes + training_label_data[0].nbytes
batch_memory = example_memory * batch_size
batch_memory /= 1e6
print('Batch memory = ' + str(batch_memory) + ' MB')
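# Note: with the parameter values above, these prints should show roughly
# 8008.0 MB for the full dataset and 4004.0 MB per batch.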
# Create the placeholders
input_placeholder = tf.placeholder(dtype=np.float32, shape=[None, input_size])
label_placeholder = tf.placeholder(dtype=np.float32, shape=[None, 1])
# Create the network
x = tf.layers.dense(inputs=input_placeholder, units=input_size, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
predictions = tf.layers.dense(inputs=x, units=1)
# Define the loss
loss_op = tf.reduce_mean(tf.square(label_placeholder - predictions))
# Define the optimiser
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_op)
# Run a TensorFlow session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Loop over epochs
    num_training_batches = int(num_training_examples / batch_size)
    num_validation_batches = int(num_validation_examples / batch_size)
    training_losses = []
    validation_losses = []
    for epoch_num in range(1000):
        print('epoch ' + str(epoch_num))
        # Training
        batch_loss_sum = 0
        for batch_num in range(num_training_batches):
            print('batch ' + str(batch_num))
            batch_inputs = training_input_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_labels = training_label_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_loss, _ = sess.run([loss_op, train_op], feed_dict={input_placeholder: batch_inputs, label_placeholder: batch_labels})
            batch_loss_sum += batch_loss
        training_loss = batch_loss_sum / num_training_batches
        training_losses.append(training_loss)
        # Validation
        batch_loss_sum = 0
        for batch_num in range(num_validation_batches):
            batch_inputs = validation_input_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_labels = validation_label_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            # Only evaluate the loss here; train_op is not run on the validation data
            batch_loss = sess.run(loss_op, feed_dict={input_placeholder: batch_inputs, label_placeholder: batch_labels})
            batch_loss_sum += batch_loss
        validation_loss = batch_loss_sum / num_validation_batches
        validation_losses.append(validation_loss)