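I'm wondering why the similarity function gives a different result from the similar_by_word function when using fastText vectors. For example, similar_by_word identifies "Dinner" and "supper" as closely related to "dinner", each with a cosine similarity of 0.79, but when I run the similarity function on 'supper' and 'dinner' it returns -2.0, which would suggest they are opposites. The code I use to initialize the model and run the functions is below: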
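import os
from gensim.models import KeyedVectors

# initialize from the crawl .vec file (first file found in the 'crawl' directory)
crawl_model = KeyedVectors.load_word2vec_format('crawl/' + os.listdir('crawl')[0])
find_similar_to = 'dinner'
# print the three nearest neighbours and their similarity scores
for similar_word in crawl_model.similar_by_word(find_similar_to, topn=3):
    print("Word: {0}, Similarity: {1:.2f}".format(
        similar_word[0].encode('utf-8'), similar_word[1]
    ))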
# results
#Word: Dinner, Similarity: 0.79
#Word: supper, Similarity: 0.79
#Word: dinners, Similarity: 0.75
print(crawl_model.wv.similarity('supper', 'dinner'))
#result
#-2.0
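For reference, here is a minimal, dependency-free sketch of the cosine similarity I would expect both functions to report. The helper name `cosine_similarity` and the toy vectors are made up for illustration; they are not part of gensim and not real fastText embeddings. Since cosine similarity is bounded between -1 and 1, a value of -2.0 cannot be a plain cosine of two vectors, which is part of why the result above confuses me.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors (hypothetical, for illustration only)
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]     # same direction as a
c = [-1.0, -2.0, -3.0]  # opposite direction to a

print(cosine_similarity(a, b))  # close to 1: same direction
print(cosine_similarity(a, c))  # close to -1: opposite direction
```

Assuming the loaded model supports word lookup by indexing, the same helper could presumably be applied to `crawl_model['supper']` and `crawl_model['dinner']` to cross-check the two gensim results.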