Error when visualizing an LDA topic model

Asked: 2018-05-11 10:55:48

Tags: python nltk lda topic-modeling

I want to interpret the topics in my LDA topic model, so I am using pyLDAvis. For some reason, though, I cannot get pyLDAvis to run. Here is the code:

import gensim
from gensim import corpora

# lemmatized_list is built earlier (not shown): one list of tokens per document
dictionary = corpora.Dictionary(lemmatized_list)
print(dictionary)
print(dictionary.token2id)

# Convert each document to a bag-of-words vector
corpus = [dictionary.doc2bow(text) for text in lemmatized_list]
print(corpus)

# Train a 5-topic LDA model
ldamodel = gensim.models.ldamodel.LdaModel(corpus, num_topics=5,
                                           id2word=dictionary, passes=10)
print(ldamodel.print_topics(num_topics=5, num_words=3))

# Visualize the topics in the notebook
import pyLDAvis.gensim
pyLDAvis.enable_notebook()
pyLDAvis.gensim.prepare(ldamodel, corpus, dictionary)

When I reach the last part of the code, where the visualization with pyLDAvis should happen, it fails with the following error:

    ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-a16bc334f38f> in <module>()
      1 import pyLDAvis.gensim
      2 pyLDAvis.enable_notebook()
----> 3 pyLDAvis.gensim.prepare(ldamodel, corpus, dictionary)
      4 term_ix = np.sort(topic_info.index.unique().values)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pyLDAvis\gensim.py in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs)
    108     """
    109     opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs)
--> 110     return vis_prepare(**opts)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pyLDAvis\_prepare.py in prepare(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency, R, lambda_step, mds, n_jobs, plot_opts, sort_topics)
    396 
    397    topic_info         = _topic_info(topic_term_dists, topic_proportion, term_frequency, term_topic_freq, vocab, lambda_step, R, n_jobs)
--> 398    token_table        = _token_table(topic_info, term_topic_freq, vocab, term_frequency)
    399    topic_coordinates = _topic_coordinates(mds, topic_term_dists, topic_proportion)
    400    client_topic_order = [x + 1 for x in topic_order]

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pyLDAvis\_prepare.py in _token_table(topic_info, term_topic_freq, vocab, term_frequency)
    265    # term-topic frequency table of unique terms across all topics and all values of lambda
    266    term_ix = topic_info.index.unique()
--> 267    term_ix.sort()
    268    top_topic_terms_freq = term_topic_freq[term_ix]
    269    # use the new ordering for the topics

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in sort(self, *args, **kwargs)
   2098 
   2099     def sort(self, *args, **kwargs):
-> 2100         raise TypeError("cannot sort an Index object in-place, use "
   2101                         "sort_values instead")
   2102 

TypeError: cannot sort an Index object in-place, use sort_values instead

Any suggestions on how to resolve this error would be very helpful. Thanks!

1 answer:

Answer 0: (score: 0)

This error has come up before and has been identified as an incompatibility between certain versions of pandas and pyLDAvis.
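The last frame of the traceback shows the cause: `.sort()` is called on a pandas `Index`, and pandas rejects in-place sorting of an `Index`. A minimal sketch of that behaviour, assuming pandas is installed; the sample values are made up for illustration and this is not pyLDAvis's actual code:

```python
import pandas as pd

# A small Index with duplicates, standing in for topic_info.index
term_ix = pd.Index([3, 1, 2, 3]).unique()

message = None
try:
    # In-place sorting of an Index is not supported by pandas
    term_ix.sort()
except TypeError as exc:
    message = str(exc)

# sort_values() returns a new, sorted Index instead
sorted_ix = term_ix.sort_values()

print(message)
print(list(sorted_ix))
```

`sort_values()` is the replacement that the error message itself asks for, so the fix has to be made inside pyLDAvis rather than in the code from the question.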

Here, they state that a specific version should fix it:

  

Duplicate of #76, fixed in version 2.1.0. Install the specific version: pip install pyldavis==2.1.0

I have not verified this myself, though.
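Before and after reinstalling, it may help to confirm which versions are actually present in the environment. The following is a rough sketch using only the standard library (Python 3.8+). The helper names `version_tuple`, `is_at_least`, and `installed_version` are hypothetical, the `2.1.0` threshold is taken from the quoted comment above, and the version parsing is deliberately simplistic (leading digits of each dotted part only):

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '2.1.0' into (2, 1, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Compare two dotted version strings, padding the shorter with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("pyLDAvis", "pandas"):
        print(name, installed_version(name))
    found = installed_version("pyLDAvis")
    if found is not None:
        # 2.1.0 is the version named in the quoted comment
        print("pyLDAvis >= 2.1.0:", is_at_least(found, "2.1.0"))
```

If the check reports an older pyLDAvis, the `pip install` command quoted above is the suggested remedy. Restart the notebook kernel afterwards so that the new version is the one imported.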