Multithreaded cosine similarity calculation on a large dataset

Time: 2019-02-06 05:03:06

Tags: python machine-learning nlp cosine-similarity

I want to compute the cosine similarity of two matrices using multiple threads. This is my current code:

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Build the raw term-count matrix (documents x vocabulary terms).
count_vect = CountVectorizer()
df_series = count_vect.fit_transform(df.Series)
df_series.shape

# Reweight the counts with TF-IDF.
tfidf_transformer = TfidfTransformer()
df_srtf = tfidf_transformer.fit_transform(df_series)

# Pairwise cosine similarity between all rows of the TF-IDF matrix.
SR = cosine_similarity(df_srtf)
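
For reference, here is a minimal sketch of what a multithreaded version might look like. It is one possible approach, not a tested solution for my data. It splits the rows into blocks and computes each block's similarities against the full matrix on a thread pool. The helper name `chunked_cosine_similarity`, the `chunk_size` and `n_threads` parameters, and the toy corpus are all made up for illustration. Whether threads actually give a speedup depends on whether the underlying matrix products release the GIL, so a process pool may be needed instead.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def chunked_cosine_similarity(X, chunk_size=1000, n_threads=4):
    """Compute cosine_similarity(X) block by block on a thread pool.

    Hypothetical helper: chunk_size and n_threads are illustrative
    defaults, not tuned values.
    """
    n_rows = X.shape[0]
    # The full result is dense (n_rows x n_rows), so memory grows quadratically.
    result = np.empty((n_rows, n_rows), dtype=np.float64)

    def fill_block(start):
        stop = min(start + chunk_size, n_rows)
        # Similarity of rows [start, stop) against every row of X.
        result[start:stop] = cosine_similarity(X[start:stop], X)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for all blocks and re-raises any worker exception.
        list(pool.map(fill_block, range(0, n_rows, chunk_size)))
    return result


# Toy corpus standing in for df.Series from the question (assumption).
docs = [
    "cosine similarity on large data",
    "large data needs parallel code",
    "threads can share one matrix",
    "similarity between text documents",
]
X = TfidfVectorizer().fit_transform(docs)
SR = chunked_cosine_similarity(X, chunk_size=2, n_threads=2)
```

Each thread writes to a disjoint slice of `result`, so no locking is needed. scikit-learn's `pairwise_distances` also accepts an `n_jobs` argument, but with `metric="cosine"` it returns cosine distance (1 minus similarity) rather than similarity.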

0 answers:

No answers