I wrote a program to compute word_embeddings:
import tensorflow as tf
import numpy as np
import math

with tf.device('/gpu:2'):
    embedding_size = 5
    vocab_size = len(vocab)  # vocab, train_words_indices, train_contexts_indices are defined earlier
    embeddings = tf.Variable(tf.random_uniform(shape=[vocab_size, embedding_size], minval=-1.0, maxval=1.0))
    embed = tf.nn.embedding_lookup(embeddings, train_words_indices)
    softmax_weights = tf.Variable(tf.truncated_normal([vocab_size, embedding_size], stddev=1.0 / math.sqrt(embedding_size)))
    logits = tf.matmul(embed, tf.transpose(softmax_weights))
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=train_contexts_indices))
    minimizer = tf.train.AdagradOptimizer(0.1).minimize(loss)

config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
sess = tf.Session(config=config)
tf.initialize_all_variables().run(session=sess)
e, s = sess.run([embed, softmax_weights])

epochs = 100000
for i in range(epochs):
    _, l = sess.run([minimizer, loss])
    if i % 10000 == 0:
        print('loss : ', l)
sess.close()
I have the following problems:
1. I want my program to use one specific GPU (here gpu:2), but it uses all 4 GPUs on the machine.
2. It is also not fast; it runs at the same speed as it does without a GPU.
I'm not sure what is going wrong here.
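From what I've read, with tf.device(...) only pins where the ops run, while TensorFlow by default still allocates memory on every visible GPU. So one workaround I'm considering (a minimal sketch, not yet verified on my machine) is to hide the other cards before TensorFlow is imported:

import os

# Assumption: this must be set before 'import tensorflow', otherwise the
# runtime has already initialized all four cards.
os.environ['CUDA_VISIBLE_DEVICES'] = '2'   # expose only physical GPU 2

import tensorflow as tf   # from here on, that card is addressed as /gpu:0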
Edit: following the suggestion in the comments, I'm adding information about the GPUs:
([u'/gpu:0', u'/gpu:1', u'/gpu:2', u'/gpu:3'])
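(For reference, I got this list with a helper along these lines; device_lib is TensorFlow's internal client module, so treat this as illustrative rather than official API:)

from tensorflow.python.client import device_lib

def get_available_gpus():
    # list_local_devices() enumerates every device this TensorFlow process can see
    return [d.name for d in device_lib.list_local_devices() if d.device_type == 'GPU']

print(get_available_gpus())   # -> [u'/gpu:0', u'/gpu:1', u'/gpu:2', u'/gpu:3']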
Also, the program prints the following lines:
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: GeForce GTX 1080, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: GeForce GTX 1080, pci bus id: 0000:82:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: GeForce GTX 1080, pci bus id: 0000:83:00.0
Another strange thing: the program prints the placement of every operation it executes, and it only ever mentions gpu:2 (plus cpu:0 for a few variables), yet it still uses the other GPUs. For example, some of the lines it prints are:
Adagrad/learning_rate: /job:localhost/replica:0/task:0/gpu:2
Variable_1/Adagrad: /job:localhost/replica:0/task:0/gpu:2
Variable_1/Adagrad/read: /job:localhost/replica:0/task:0/gpu:2
Variable_1/Adagrad/Assign: /job:localhost/replica:0/task:0/gpu:2
Variable/Adagrad: /job:localhost/replica:0/task:0/cpu:0
Variable/Adagrad/read: /job:localhost/replica:0/task:0/cpu:0
Variable/Adagrad/Assign: /job:localhost/replica:0/task:0/cpu:0
gradients/embedding_lookup_grad/ExpandDims: /job:localhost/replica:0/task:0/gpu:2
gradients/embedding_lookup_grad/strided_slice: /job:localhost/replica:0/task:0/gpu:2
gradients/embedding_lookup_grad/concat: /job:localhost/replica:0/task:0/gpu:2
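In case it's relevant, the docs also mention per-session GPU options on the ConfigProto. Here is a sketch of what I could try (untested), assuming visible_device_list works the way I think it does:

config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
config.gpu_options.visible_device_list = '2'   # expose only physical GPU 2 to this session
config.gpu_options.allow_growth = True         # grab GPU memory on demand instead of all at once
sess = tf.Session(config=config)
# Note: with this setting the remaining card shows up as /gpu:0 inside the session,
# so the tf.device('/gpu:2') line above would need to become tf.device('/gpu:0').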