TensorFlow cosine similarity with sparse_placeholder?

Time: 2017-11-25 21:03:05

Tags: python tensorflow scikit-learn tf-idf cosine-similarity

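I am using TF-IDF in Python to vectorize two large text corpora and then compute the cosine similarity between them...

...but it is slow.

I would like to do the cosine similarity step in TensorFlow to speed it up. TfidfVectorizer returns SciPy sparse matrices, so my understanding is that I have to load them with sparse_placeholder: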

import tensorflow as tf

# tfidf_matrix and tfidf_matrix2 are created by fit_transform() from sklearn's TfidfVectorizer

a = tf.sparse_placeholder(tf.float32, shape=tfidf_matrix.shape, name="input_placeholder_a")
b = tf.sparse_placeholder(tf.float32, shape=tfidf_matrix2.shape, name="input_placeholder_b")
# This is the line that raises the TypeError shown below
s = tf.losses.cosine_distance(tf.nn.l2_normalize(a, 0), tf.nn.l2_normalize(b, 0), dim=0)

sess = tf.Session()
cos_sim = sess.run(s, feed_dict={a: tfidf_matrix, b: tfidf_matrix2})
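
A side note on the feed_dict in the snippet: as far as I understand the TF 1.x API, a sparse placeholder is fed with a tf.SparseTensorValue (or a plain (indices, values, dense_shape) tuple), not with a SciPy matrix directly. The sketch below shows one way to build that triple with NumPy and SciPy only. The helper name to_sparse_feed is hypothetical and not part of TensorFlow or scikit-learn.

```python
import numpy as np
import scipy.sparse as sp


def to_sparse_feed(matrix):
    """Return (indices, values, dense_shape) for a SciPy sparse matrix.

    Hypothetical helper for illustration; not a TensorFlow or sklearn API.
    """
    coo = sp.coo_matrix(matrix)
    # Put the entries in row-major order (by row, then by column).
    order = np.lexsort((coo.col, coo.row))
    indices = np.stack([coo.row[order], coo.col[order]], axis=1).astype(np.int64)
    values = coo.data[order].astype(np.float32)
    dense_shape = np.array(coo.shape, dtype=np.int64)
    return indices, values, dense_shape
```

Under that assumption, the feed would look something like feed_dict={a: tf.SparseTensorValue(*to_sparse_feed(tfidf_matrix))}. This only concerns feeding the data; it does not make tf.nn.l2_normalize accept a SparseTensor.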

But I get the following error:

TypeError: Failed to convert object of type <class 'tensorflow.python.framework.sparse_tensor.SparseTensor'> to Tensor. Contents: SparseTensor(indices=Tensor("input_placeholder_a/indices:0", shape=(?, 2), dtype=int64), values=Tensor("input_placeholder_a/values:0", shape=(?,), dtype=float32), dense_shape=Tensor("input_placeholder_a/shape:0", shape=(2,), dtype=int64)). Consider casting elements to a supported type.

The error is raised when running s = tf.losses.cosine_distance(tf.nn.l2_normalize(a, 0), tf.nn.l2_normalize(b, 0), dim=0), i.e. it is the normalize function that complains.

Can I pass sparse_placeholders to the normalize function?
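
For reference, here is a minimal sketch of the computation I am after, written with SciPy so that nothing is converted to a dense array. It assumes the goal is the pairwise cosine similarity between the rows (documents) of the two matrices, and that both matrices have the same number of columns, i.e. the same vocabulary. The function names are hypothetical.

```python
import numpy as np
import scipy.sparse as sp


def l2_normalize_rows(matrix):
    """Scale every row of a sparse matrix to unit L2 norm (stays sparse)."""
    csr = sp.csr_matrix(matrix, dtype=np.float64)
    # Row norms: square root of the sum of squared entries in each row.
    norms = np.sqrt(np.asarray(csr.multiply(csr).sum(axis=1)).ravel())
    # Leave all-zero rows untouched instead of dividing by zero.
    norms[norms == 0.0] = 1.0
    return sp.diags(1.0 / norms) @ csr


def sparse_cosine_similarity(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    return l2_normalize_rows(a) @ l2_normalize_rows(b).T
```

With TfidfVectorizer's default norm='l2' the rows are already unit length, so the normalization step should be a no-op there and the sparse product alone gives the similarities.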

0 answers:

No answers yet