tf.nn.embedding_lookup takes too much time

Time: 2018-04-10 19:03:51

Tags: python tensorflow deep-learning word2vec word-embedding

I am trying to look up GloVe word embeddings with tensorflow's embedding_lookup, but tensorflow takes far too much time. For example, with this input:

data="""This isn't the comedic Robin Williams, nor is it the quirky/insane Robin Williams of recent thriller fame. This is a hybrid of the classic drama without over-dramatization, mixed with Robin's new love of the thriller. But this isn't a thriller, per se. This is more a mystery/suspense vehicle through which Williams attempts to locate a sick boy and his keeper.<br /><br />Also starring Sandra Oh and Rory Culkin, this Suspense Drama plays pretty much like a news report, until William's character gets close to achieving his goal.<br /><br />I must say that I was highly entertained, though this movie fails to teach, guide, inspect, or amuse. It felt more like I was watching a guy (Williams), as he was actually performing the actions, from a third person perspective. In other words, it felt real, and I was able to subscribe to the premise of the story.<br /><br />All in all, it's worth a watch, though it's definitely not Friday/Saturday night fare.<br /><br />It rates a 7.7/10 from...<br /><br />the Fiend :."""

After cleaning this text and converting it to token ids with the GloVe vocabulary, it looks like this:

data=[37, 228955, 0, 21698, 4972, 1317, 2094, 14, 20, 0, 17151, 14916, 4972, 1317, 3, 397, 8965, 3152, 37, 14, 7, 7008, 3, 0, 2392, 2692, 296, 74, 61025, 2195, 17, 29196, 50, 835, 3, 0, 8965, 34, 37, 228955, 7, 8965, 532, 7366, 37, 14, 56, 7, 5351, 17495, 1907, 131, 42, 1317, 2444, 4, 10170, 7, 4478, 1606, 5, 26, 7529, 52, 3935, 10105, 3202, 5, 18534, 52493, 37, 17495, 2692, 1381, 1922, 181, 117, 7, 172, 255, 207, 1317, 1395, 1666, 383, 4, 7090, 26, 715, 41, 390, 203, 12, 41, 15, 1786, 18915, 413, 37, 1005, 6186, 4, 5293, 3372, 12000, 46, 46183, 20, 1349, 56, 117, 41, 15, 2641, 7, 1856, 1317, 19, 18, 15, 1403, 3349, 0, 1970, 25, 7, 245, 899, 5251, 6, 68, 1374, 20, 1349, 567, 5, 41, 15, 667, 4, 20769, 4, 0, 11932, 3, 0, 523, 64, 6, 64, 47, 1089, 7, 1716, 413, 47, 3936, 36, 185, 277, 364, 6769, 20, 864, 7, 5599, 206, 25, 0, 53727]

Now if I try to use tensorflow's embedding_lookup:

import numpy as np
import tensorflow as tf

# Placeholder for a batch of token-id sequences.
input_x = tf.placeholder(tf.int32, shape=[None, None])

# Pre-trained GloVe embedding matrix (vocab_size x embedding_dim).
data_x = np.load('word_embedding_GloVe.npy')

embedding = tf.nn.embedding_lookup(data_x, input_x)

with tf.Session() as sess:
    print(sess.run(embedding, feed_dict={input_x: [data]}))

But this takes much longer, whereas doing it by hand with numpy indexing is much faster. What am I doing wrong?
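For reference, the "manual" numpy approach mentioned above is just fancy indexing into the embedding matrix. A minimal sketch (the small random matrix below stands in for the array loaded from `word_embedding_GloVe.npy`, and the shapes are assumptions for illustration):

```python
import numpy as np

# Stand-in for the GloVe matrix (vocab_size x embedding_dim);
# random values here, only the shapes matter for the sketch.
vocab_size, embedding_dim = 300000, 50
data_x = np.random.rand(vocab_size, embedding_dim).astype(np.float32)

# First few token ids from the cleaned review above.
token_ids = [37, 228955, 0, 21698, 4972]

# NumPy fancy indexing: one embedding row per token id,
# with no graph construction or session overhead.
embedded = data_x[np.array(token_ids)]
print(embedded.shape)  # (5, 50)
```

Since the lookup itself is just a gather over rows, any extra time on the tensorflow side comes from the surrounding graph and session machinery, not the indexing operation.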

0 Answers:

There are no answers yet.