TypeError: '>' not supported between instances of 'float' and 'NoneType'

Time: 2018-03-27 13:24:38

Tags: python gensim lda

I trained an LDA model with the gensim library and use it to extract topic vectors for documents, using the code below:

from nltk.tokenize import RegexpTokenizer
from nltk.stem.porter import PorterStemmer

# en_stop (a collection of English stop words) is assumed to be defined elsewhere

def clean_doc(data_string):
    global en_stop
    tokenizer = RegexpTokenizer(r'\w+')  # split on runs of word characters
    p_stemmer = PorterStemmer()  # Porter stemmer instance
    # lowercase and tokenize the document string
    raw = data_string.lower()
    tokens = tokenizer.tokenize(raw)
    # remove stop words from tokens
    stopped_tokens = [i for i in tokens if i not in en_stop]
    # stem tokens
    stemmed_tokens = [p_stemmer.stem(i) for i in stopped_tokens]
    return stemmed_tokens

def infer_lda_vector(s, dictionary, model, dimensions):
    #s = s.decode('utf-8')
    vector = [0.0]*dimensions
    s = clean_doc(s)
    bow_vector = dictionary.doc2bow(s)   
    lda_vector = model[bow_vector]            
    for i in lda_vector:
        vector[i[0]] = i[1]
    return vector

I call it like this:

text = "this a test"
lda_vector = infer_lda_vector(text, dictionary, lda_model, 300)

This exact code worked under Python 2.7, but after I updated my system to Python 3.x it throws the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-723f03d03620> in <module>()
      1 text = "this a a test"
----> 2 lda_vector = infer_lda_vector(text, dictionary, lda_model, 300)
      3 lda_vector

<ipython-input-34-885205b68d9e> in infer_lda_vector(s, dictionary, model, dimensions)
     34     s = clean_doc(s)
     35     bow_vector = dictionary.doc2bow(s)
---> 36     lda_vector = model[bow_vector]
     37     for i in lda_vector:
     38         vector[i[0]] = i[1]

C:\ProgramData\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in __getitem__(self, bow, eps)
   1158             `(topic_id, topic_probability)` 2-tuples.
   1159         """
-> 1160         return self.get_document_topics(bow, eps, self.minimum_phi_value, self.per_word_topics)
   1161 
   1162     def save(self, fname, ignore=('state', 'dispatcher'), separately=None, *args, **kwargs):

C:\ProgramData\Anaconda3\lib\site-packages\gensim\models\ldamodel.py in get_document_topics(self, bow, minimum_probability, minimum_phi_value, per_word_topics)
    979         if minimum_probability is None:
    980             minimum_probability = self.minimum_probability
--> 981         minimum_probability = max(minimum_probability, 1e-8)  # never allow zero values in sparse output
    982 
    983         if minimum_phi_value is None:

TypeError: '>' not supported between instances of 'float' and 'NoneType'

What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

Clean the conda cache and reinstall gensim.

conda clean -t
conda install gensim

My guess is that a corrupted version was installed, and that the clean command removed it before the reinstall.
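For context on the error itself: the traceback shows `max(minimum_probability, 1e-8)` failing right after `minimum_probability` was taken from `self.minimum_probability`, which suggests that attribute was None on the loaded model. Python 2 allowed ordering None against a number, while Python 3 raises TypeError. The sketch below reproduces only that comparison with a stand-in class. `FakeLda`, `threshold`, and `reproduces_error` are hypothetical names for illustration and are not part of gensim. Why the attribute would be None is not clear from the question.

```python
# Stand-in for the lines of get_document_topics shown in the traceback.
# FakeLda is a hypothetical class for illustration; it is NOT gensim's LdaModel.
class FakeLda:
    def __init__(self, minimum_probability=None):
        self.minimum_probability = minimum_probability

    def threshold(self, minimum_probability=None):
        # Same fallback pattern as in the traceback
        if minimum_probability is None:
            minimum_probability = self.minimum_probability
        # On Python 3 this raises TypeError when the value is still None
        return max(minimum_probability, 1e-8)


def reproduces_error(model):
    """Return True if computing the threshold raises TypeError."""
    try:
        model.threshold()
    except TypeError:
        return True
    return False


broken = FakeLda(minimum_probability=None)  # attribute missing a real value
fixed = FakeLda(minimum_probability=0.01)   # attribute holds a float

print(reproduces_error(broken))
print(reproduces_error(fixed))
```

If the same thing is happening with the real model, inspecting `lda_model.minimum_probability` before calling `infer_lda_vector` would show whether it is None. Setting it to a float is something to try, not a confirmed fix.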