I am trying to create a topic model with Gensim.
For this I use the LdaMallet wrapper:
from gensim.models.wrappers import LdaMallet
After getting the data into the right format, I use the following code:
mallet_path = 'D:/Emil/Onderzoek/Eerste onderzoek/Python/Mallet/mallet-2.0.8/bin/mallet'
def compute_coherence_values(dictionary, corpus, texts, limit, start=6, step=2):
    coherence_values = []
    model_list = []
    for num_topics in range(start, limit, step):
        model = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus_tfidf, num_topics=num_topics, id2word=id2word2)
        model_list.append(model)
        coherencemodel = CoherenceModel(model=model, texts=texts, dictionary=dictionary, coherence='c_v')
        coherence_values.append(coherencemodel.get_coherence())
    return model_list, coherence_values
I then call this function with:
model_list, coherence_values = compute_coherence_values(dictionary=id2word2, corpus=corpus_tfidf, texts=texts, start=8, limit=24, step=4)
However, this returns the following error:
Traceback (most recent call last):
  File "<ipython-input-344-0d49bf93f86b>", line 1, in <module>
    model_list, coherence_values = compute_coherence_values(dictionary=id2word2, corpus=corpus_tfidf, texts=texts, start=8, limit=24, step=4)
  File "<ipython-input-343-9691c04ba857>", line 20, in compute_coherence_values
    model = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus_tfidf, num_topics=num_topics, id2word=id2word2)
  File "D:\Apps\Anaconda3\lib\site-packages\gensim\models\wrappers\ldamallet.py", line 132, in __init__
    self.train(corpus)
  File "D:\Apps\Anaconda3\lib\site-packages\gensim\models\wrappers\ldamallet.py", line 273, in train
    self.convert_input(corpus, infer=False)
  File "D:\Apps\Anaconda3\lib\site-packages\gensim\models\wrappers\ldamallet.py", line 262, in convert_input
    check_output(args=cmd, shell=True)
  File "D:\Apps\Anaconda3\lib\site-packages\gensim\utils.py", line 1918, in check_output
    raise error
CalledProcessError: Command 'D:/Emil/Onderzoek/Eerste onderzoek/Python/Mallet/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\E26E5~1.RIJ\AppData\Local\Temp\9\b77163_corpus.txt --output C:\Users\E26E5~1.RIJ\AppData\Local\Temp\9\b77163_corpus.mallet' returned non-zero exit status 1.
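Can anyone tell me what I should do to avoid this error?
I found the question below, where the same error occurs. However, that case involves a different package, so I don't know how to apply it to my problem.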
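The traceback above only reports that the MALLET command returned exit status 1, not what MALLET itself printed. A minimal sketch for surfacing that output is to run the command string by hand and capture stderr. The helper name run_and_capture is hypothetical (it is not part of Gensim), and the demo command is a stand-in that fails on purpose, not the real MALLET call:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a shell command string and return (exit_code, stdout, stderr).

    Hypothetical helper for debugging; it mirrors the shell=True call seen
    in the traceback but keeps the child's stderr instead of discarding it.
    """
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in command that fails deliberately, to show what the helper returns.
demo_cmd = (
    '"' + sys.executable + '"'
    + ' -c "import sys; sys.stderr.write(\'boom\'); sys.exit(3)"'
)
code, out, err = run_and_capture(demo_cmd)
print("exit status:", code)
print("stderr:", err)
```

To apply this to the failing call, the command string from the CalledProcessError message could be passed to run_and_capture and the returned stderr printed. One thing worth checking in that output: the executable path in the command contains a space ("Eerste onderzoek") and is not quoted, which a shell may split into two separate arguments.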