Question

I am using pre-trained vectors to create an embedding like this:

import numpy
import gensim
import tensorflow
ft_model=gensim.models.KeyedVectors.load_word2vec_format("ft_model.vec")
vocabulary=ft_model.vocab
embeddings=numpy.array([ft_model.word_vec(x) for x in vocabulary.keys()])

vocabulary_size=len(vocabulary)
embedding_size=embeddings.shape[1]

W=tensorflow.Variable(
    tensorflow.constant(0.0, shape=[vocabulary_size, embedding_size]),
    trainable=False,
    name="W"
)
embedding_placeholder=tensorflow.placeholder(
    tensorflow.float32,[vocabulary_size,embedding_size],
    name="fasttext_vector"
)
embedding_init=W.assign(embedding_placeholder)
data_placeholder=tensorflow.placeholder(tensorflow.int32,shape=[None, max_length])
embedding_layer=tensorflow.nn.embedding_lookup(W, data_placeholder)

After running for just 1 or 2 training batches, the code crashes completely and I get this error:

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[5000,14621,100]

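The stack trace clearly shows that the error is caused by the line embedding_layer=tensorflow.nn.embedding_lookup(W, data_placeholder). Any idea what is causing it? 100 is the embedding size, but the other numbers (5000, 14621) are rather odd: they are larger than anything I expected, and they seem to make TensorFlow chew through all of the GPU memory. An embedding lookup seems like a common operation, and the .vec file I am loading is quite small.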
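
For reference, the shape in the error message can be checked with a little arithmetic. tf.nn.embedding_lookup returns a tensor whose shape is the shape of the ids followed by the embedding dimension, so an output of [5000, 14621, 100] would correspond to ids of shape [5000, 14621]. The sketch below is only an illustration: it assumes float32 elements, the ids shape is inferred from the error message rather than stated in the question, and vocabulary_size is a hypothetical value that does not affect the result.

```python
from math import prod

FLOAT32_BYTES = 4  # assumption: W is float32, as in the question's code


def lookup_output_shape(ids_shape, params_shape):
    """Shape of tf.nn.embedding_lookup(params, ids): ids.shape + params.shape[1:]."""
    return tuple(ids_shape) + tuple(params_shape[1:])


def tensor_bytes(shape, bytes_per_element=FLOAT32_BYTES):
    """Memory needed to hold a dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


vocabulary_size = 50_000   # hypothetical; the row count of W does not change the output shape
ids_shape = (5000, 14621)  # inferred from the error message, not given in the question

out_shape = lookup_output_shape(ids_shape, (vocabulary_size, 100))
size_gib = tensor_bytes(out_shape) / 2**30
print(out_shape, f"{size_gib:.1f} GiB")
```

Under these assumptions the lookup result alone needs roughly 29 GB, so it may be worth checking the shape of the array actually fed into data_placeholder (for example, whether the whole dataset is being fed at once and what max_length is).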

Answer 1

I ran into this error recently:

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3039345,400]

I followed the instructions in this thread, as shown below:

WE = tf.Variable(tf.constant(0.0, shape=[vocab_size, numDimensions]), trainable=False, name="WE")
embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, numDimensions])
embedding_init = WE.assign(embedding_placeholder)

sess = tf.Session()
sess.run(embedding_init, feed_dict={embedding_placeholder: wordVectors})

But no luck. I have an NVIDIA GTX 1070 graphics card with 8 GB of memory. My batch size is 24 and my sentence length is 11.
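
As a rough sanity check on these numbers, the size of the embedding matrix itself can be estimated. This is a minimal sketch that assumes float32 elements and, as a labeled guess, that two copies of the matrix (the variable WE and the value fed through embedding_placeholder) are on the GPU at the moment of the assign; the real number of copies depends on how TensorFlow places the ops.

```python
FLOAT32_BYTES = 4  # assumption: float32 embeddings
GIB = 2**30


def matrix_gib(rows, cols, bytes_per_element=FLOAT32_BYTES):
    """Size in GiB of a dense rows x cols matrix."""
    return rows * cols * bytes_per_element / GIB


one_copy = matrix_gib(3039345, 400)  # shape taken from the error message
gpu_gib = 8                          # nominal memory of the GTX 1070
copies_on_device = 2                 # assumption: WE plus the fed placeholder value

needed = one_copy * copies_on_device
print(f"one copy: {one_copy:.2f} GiB, "
      f"{copies_on_device} copies: {needed:.2f} GiB, GPU: {gpu_gib} GiB")
```

If that assumption holds, the embedding matrix alone would already exceed the card's memory, regardless of the batch size or sentence length.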

Answer 2

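Your computer may be running out of memory (RAM). Check the Task Manager before starting the model. I have 16 GB and usage was already at 79%, so it ran out. It may also help to check how much RAM is left from a Jupyter notebook once the data has been prepared.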
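
One way to do that check from a notebook cell is sketched here. It assumes the third-party psutil package is installed and falls back to os.sysconf where that is available (typically Linux); the helper name available_ram_bytes is made up for this example.

```python
import os


def available_ram_bytes():
    """Return available physical RAM in bytes, or None if it cannot be determined."""
    try:
        import psutil  # third-party package; assumed to be installed
        return int(psutil.virtual_memory().available)
    except ImportError:
        pass
    try:
        # Fallback for platforms that expose these sysconf names (e.g. Linux)
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_AVPHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


free = available_ram_bytes()
if free is None:
    print("Could not determine available RAM on this platform")
else:
    print(f"Available RAM: {free / 2**30:.2f} GiB")
```

Running it once before loading the vectors and once after preparing the data gives a rough idea of how much headroom is left.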

TensorFlow runs out of memory when executing embedding_lookup (ResourceExhaustedError)
