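I have a list of 1200 sentences. I want to compute the Jaccard coefficient of each sentence in the list against all the sentences that follow it, i.e. sent1 is compared with sent2, sent3, ..., then sent2 with sent3, sent4, ... and so on. I already have a function that takes 2 sets and returns the Jaccard coefficient. I just want to know how to write the Python loop for the scenario above.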
list_question = []  # This list is later filled with sentences from a file
def jaccard(a, b):  # computes Jaccard
    c = a.intersection(b)
    return float(len(c)) / (len(a) + len(b) - len(c))
# ....Here I want to write the loop to compute the Jaccard of sentences as explained in the question
I want to form groups of sentences that are similar, based on a Jaccard coefficient score > 0.5.
Thanks
Answer 0 (score: 0)
You can use itertools.combinations like this:
import itertools
def do_some_stuff(first, second):
    print('comparing', first, 'to', second)
sentences_list = ['first', 'second', 'third', 'fourth']
# combinations(..., 2) yields each unordered pair exactly once
combinations = itertools.combinations(sentences_list, 2)
for first, second in combinations:
    do_some_stuff(first, second)
The snippet above gives you this output:
comparing first to second
comparing first to third
comparing first to fourth
comparing second to third
comparing second to fourth
comparing third to fourth
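To tie this back to the question, here is a minimal sketch of how the same pairwise loop could drive the asker's jaccard function and collect sentences scoring above 0.5. Several things here are assumptions, not part of the original post: Python 3, tokenizing by lowercasing and splitting on whitespace, the helper names similar_pairs and group_similar, the empty-set guard in jaccard, and the choice to treat "groups" as sets of sentences linked directly or transitively by a similar pair (a small union-find). The three sample sentences are made up for illustration.

```python
import itertools


def jaccard(a, b):
    """Jaccard coefficient of two sets (returns 0.0 when both are empty)."""
    c = a.intersection(b)
    union_size = len(a) + len(b) - len(c)
    return len(c) / union_size if union_size else 0.0


def similar_pairs(sentences, threshold=0.5):
    """Return (i, j, score) for every pair i < j whose score exceeds threshold."""
    # Assumption: a sentence becomes a set of lowercased whitespace-separated tokens.
    token_sets = [set(s.lower().split()) for s in sentences]
    pairs = []
    for i, j in itertools.combinations(range(len(token_sets)), 2):
        score = jaccard(token_sets[i], token_sets[j])
        if score > threshold:
            pairs.append((i, j, score))
    return pairs


def group_similar(sentences, threshold=0.5):
    """Group sentence indices that are linked, directly or transitively, by a similar pair."""
    parent = list(range(len(sentences)))

    def find(x):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, _ in similar_pairs(sentences, threshold):
        parent[find(j)] = find(i)

    groups = {}
    for idx in range(len(sentences)):
        groups.setdefault(find(idx), []).append(idx)
    return list(groups.values())


# Hypothetical sample data standing in for the sentences read from the file.
list_question = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "dogs bark loudly at night",
]
print(similar_pairs(list_question))
print(group_similar(list_question))  # [[0, 1], [2]]
```

For 1200 sentences this loop visits 1200 * 1199 / 2 = 719,400 pairs, so building the token sets once up front, as above, avoids re-splitting each sentence on every comparison. If transitive grouping is not what you want (A similar to B and B similar to C puts A and C together even when they score below 0.5), you could work from the output of similar_pairs directly.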