I have the following function in Python:
from sklearn.metrics.pairwise import linear_kernel

def find_new_similar(tfidf_matrix2, index, tfidf_matrix, top_n = 1):
    cosine_similarities = linear_kernel(tfidf_matrix2[index:index+1], tfidf_matrix).flatten()
    related_docs_indices = [i for i in cosine_similarities.argsort()[::-1] if i != index]
    return [(index, cosine_similarities[index]) for index in related_docs_indices][0:top_n]
When I call it, I get something like this:
>>> find_new_similar(tfidf_matrix2, 40, tfidf_matrix)
([(260816, 0.55759049663331683)])
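This is the index from related_docs_indices together with the corresponding value from cosine_similarities. I would also like to return the initial index that was passed to the function as input. I tried: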
def find_new_similar(tfidf_matrix2, index, tfidf_matrix, top_n = 1):
    cosine_similarities = linear_kernel(tfidf_matrix2[index:index+1], tfidf_matrix).flatten()
    related_docs_indices = [i for i in cosine_similarities.argsort()[::-1] if i != index]
    return [(index, cosine_similarities[index]) for index in related_docs_indices][0:top_n], index
That is, I simply appended ", index" to the end of the return statement. But this outputs:
>>> find_new_similar(tfidf_matrix2, 40, tfidf_matrix)
([(260816, 0.55759049663331683)], 0)
But I was actually expecting:
([(260816, 0.55759049663331683)], 40)
Thanks in advance.
Answer 0 (score: 1)
The value of index is being overwritten inside the list comprehension (marked with ** in the code below)!
return [(index, cosine_similarities[index]) for **index** in related_docs_indices][0:top_n], index
So renaming the iteration variable gives the desired result!
return [(i, cosine_similarities[i]) for i in related_docs_indices][0:top_n], index
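For context, this rebinding is Python 2 behavior: there, a list comprehension's loop variable leaks into the enclosing function scope, so after the comprehension finishes, index holds the last element of related_docs_indices (here 0). In Python 3, comprehensions have their own scope and the original code would return 40, but using a distinct loop variable is still clearer. Below is a minimal sketch of the renamed version. It is not the original function: to keep it runnable without scikit-learn, a plain list named scores (made-up values) stands in for cosine_similarities, and sorted() stands in for argsort()[::-1].

```python
def find_new_similar(scores, index, top_n=1):
    """Hypothetical stand-in: `scores` plays the role of cosine_similarities."""
    # Rank positions by descending score (the role of argsort()[::-1]),
    # skipping the query position itself.
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    related_docs_indices = [i for i in order if i != index]
    # The loop variable is `i`, not `index`, so the parameter is never rebound.
    return [(i, scores[i]) for i in related_docs_indices][:top_n], index


scores = [0.1, 0.9, 0.4, 1.0]  # made-up similarity values, for illustration only
result = find_new_similar(scores, 3)
print(result)  # ([(1, 0.9)], 3)
```

The second element of the returned tuple is now the index that was passed in, regardless of the Python version.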