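from sklearn.feature_extraction.text import TfidfVectorizer

tfvect = TfidfVectorizer(use_idf=True, stop_words='english')
# Fit on search term and product title combined so both share one vocabulary
wholeword = df_all['search_term'] + " " + df_all['product_title']
vocab = tfvect.fit_transform(wholeword)
st = tfvect.transform(df_all['search_term'])
pt = tfvect.transform(df_all['product_title'])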
I want to get the cosine similarity between each row of st and the corresponding row of pt, and store it in df_all['similarity'].
Answer 0 (score: 0)
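Use scipy.spatial.distance.cosine(v1, v2) on each pair of rows. Note that it returns the cosine distance, so the similarity is 1 minus the result, and it expects dense 1-D arrays rather than sparse rows. http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.cosine.html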
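
A minimal sketch of how this might look end to end. The two-row df_all below is a made-up stand-in for the real data, and the column name 'similarity' is taken from the question. It relies on TfidfVectorizer's default norm='l2': each non-empty row is already unit length, so the row-wise dot product of st and pt should equal the cosine similarity without looping over rows. The scipy call is used only to cross-check the first row.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cosine
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for df_all; the real frame comes from the question.
df_all = pd.DataFrame({
    "search_term": ["angle bracket", "deck paint"],
    "product_title": ["steel angle bracket", "wood deck paint"],
})

tfvect = TfidfVectorizer(use_idf=True, stop_words="english")
tfvect.fit(df_all["search_term"] + " " + df_all["product_title"])
st = tfvect.transform(df_all["search_term"])
pt = tfvect.transform(df_all["product_title"])

# Rows are L2-normalized (default norm="l2"), so the element-wise product
# summed per row is the cosine similarity of matching rows.
# Rows with no known terms are all zeros and get similarity 0.
df_all["similarity"] = np.asarray(st.multiply(pt).sum(axis=1)).ravel()

# Cross-check row 0 with scipy: cosine() returns a distance, so subtract from 1.
first = 1.0 - cosine(st[0].toarray().ravel(), pt[0].toarray().ravel())
```

The vectorized form avoids converting every row to a dense array, which matters when the vocabulary is large. If the vectorizer were created with norm=None, the dot product would no longer be a cosine similarity and the rows would need to be normalized first.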