I'm trying to use gensim's Phrases model to identify phrases in text.
I'm using the following:
bigram = models.Phrases(txt_to_words,min_count=min_count, threshold=threshold,common_terms=common_terms)
I get this error:
<ipython-input-13-1c8b06a0b078> in words_to_phrases(txt_to_words, min_count, threshold)
33 common_terms=["of", "with", "without", "and", "or", "the", "a","in","to","is","but"]
34
---> 35 bigram = models.Phrases(txt_to_words,min_count=min_count, threshold=threshold,common_terms=common_terms)
36
37 # trigram
TypeError: __init__() got an unexpected keyword argument 'common_terms'
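I have the latest gensim package, 2.0+.
Any idea why it doesn't recognize the common_terms argument?
Answer 0 (score: 0)
Hmm... the latest version is 3.4.0, so 2.x is not the latest. The TypeError means the Phrases class in your installed version does not accept common_terms.
Try upgrading gensim: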
pip install -U gensim
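After upgrading, it's worth confirming which version the notebook actually imports, since pip may have installed into a different environment than the one the kernel runs in. A minimal sketch, assuming Python 3.8+ (for importlib.metadata); the helper name installed_version is made up for illustration:

```python
import sys
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Show which interpreter is running and which gensim it sees.
print("interpreter:", sys.executable)
print("gensim:", installed_version("gensim"))
```

If the printed version is still old, restart the kernel, or run the upgrade with the interpreter shown above (python -m pip install -U gensim).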
This toy example works for me:
from gensim.models.phrases import Phrases
txt_to_words = [['first', 'sentence'], ['and', 'second', 'sentence']]
common_terms = ["of", "with", "without", "and", "or", "the", "a","in","to","is","but"]
bigram = Phrases(txt_to_words, min_count=1, common_terms=common_terms)
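One caveat if you upgrade past the 3.x series: gensim 4.0 renamed common_terms to connector_words, so the same TypeError comes back there with the old name. Here is a rough sketch that picks whichever keyword the installed Phrases accepts. The helper connector_kwarg is hypothetical (not part of gensim), and it only assumes that the keyword appears in the signature of Phrases.__init__:

```python
import inspect


def connector_kwarg(phrases_cls, terms):
    """Return {keyword: terms} using the keyword that phrases_cls.__init__ accepts.

    Hypothetical helper, not a gensim API. Returns {} when neither
    keyword is supported (a gensim too old for this feature).
    """
    params = inspect.signature(phrases_cls.__init__).parameters
    for name in ("connector_words", "common_terms"):
        if name in params:
            return {name: terms}
    return {}


try:
    from gensim.models.phrases import Phrases
except ImportError:
    Phrases = None  # gensim is not installed in this environment

if Phrases is not None:
    txt_to_words = [['first', 'sentence'], ['and', 'second', 'sentence']]
    common_terms = ["of", "with", "without", "and", "or", "the", "a", "in", "to", "is", "but"]
    # Pass the term list under whichever name this gensim version understands.
    bigram = Phrases(txt_to_words, min_count=1, **connector_kwarg(Phrases, common_terms))
```

This keeps a single call site working across versions; if you only target one gensim version, passing the keyword directly is simpler.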