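I am trying to use the lda 1.0.2 package in Python 3. My code snippet is below. What I have tried: I am not creating a separate dictionary, so I am not sure why I am getting this error (most links suggest getting rid of the dictionary). I also tried creating an ndarray instead of a matrix, but that gave a "need more than 0 values to unpack" error.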
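from sklearn.datasets import load_files
from sklearn.feature_extraction.text import TfidfVectorizer

# path points to the directory that holds the text files
dataset = load_files(path, encoding='utf-8')
vectorizer = TfidfVectorizer(max_features=10000, stop_words='english')
data_vector = vectorizer.fit_transform(dataset.data)
# data_array = numpy.asarray(data_vector)
import lda
model = lda.LDA(n_topics=5)
data_lda = model.fit(data_vector)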
The code fails at model.fit(...).
I am new to Python. If someone could explain the logic, that would be very helpful. Thanks.
Edit: attaching the full traceback.
IndexError : Traceback (most recent call last)
import lda
4 model = lda.LDA(n_topics= 1)
----> 5 data_lda = model.fit(data_vector);
/usr/local/lib/python2.7/dist-packages/lda/lda.pyc in fit(self, X, y)
118 Returns the instance itself.
119 """
--> 120 self._fit(X)
121 return self
122
/usr/local/lib/python2.7/dist-packages/lda/lda.pyc in _fit(self, X)
212 random_state = lda.utils.check_random_state(self.random_state)
213 rands = self._rands.copy()
--> 214 self._initialize(X)
215 for it in range(self.n_iter):
216 # FIXME: using numpy.roll with a random shift might be faster
/usr/local/lib/python2.7/dist-packages/lda/lda.pyc in _initialize(self, X)
255 np.testing.assert_equal(N, len(WS))
256 for i in range(N):
--> 257 w, d = WS[i], DS[i]
258 z_new = i % n_topics
259 ZS[i] = z_new
IndexError: index 0 is out of bounds for axis 0 with size 0
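
A possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the error says an array of size 0 was indexed, which would happen if lda found no word tokens at all in data_vector. LDA models integer word counts, while TfidfVectorizer yields fractional weights, most of them below 1; if those are treated as counts they truncate to 0. The plain-Python sketch below uses a made-up three-document corpus and a simplified tf-idf formula (both hypothetical stand-ins, not the lda or scikit-learn internals) to show the effect.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for dataset.data
docs = [
    "topic models need word counts",
    "tfidf weights are fractional not counts",
    "word counts are whole numbers",
]

tokenized = [doc.split() for doc in docs]
vocab = sorted({word for doc in tokenized for word in doc})

# Integer document-term counts: one row per document, one column per word
counts = []
for doc in tokenized:
    freq = Counter(doc)
    counts.append([freq[word] for word in vocab])


def tfidf_rows(count_rows):
    """Return L2-normalised tf-idf rows (a simplified weighting, assumed here)."""
    n_docs = len(count_rows)
    n_terms = len(count_rows[0])
    # Document frequency: in how many documents each word appears
    df = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_terms)]
    # Smoothed inverse document frequency
    idf = [math.log((1 + n_docs) / (1 + df[j])) + 1 for j in range(n_terms)]
    rows = []
    for row in count_rows:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        rows.append([v / norm for v in weighted])
    return rows


tfidf = tfidf_rows(counts)

# Total number of tokens seen if each matrix is read as integer counts
tokens_from_counts = sum(sum(row) for row in counts)
tokens_from_tfidf = sum(int(v) for row in tfidf for v in row)

print(tokens_from_counts)  # 16 tokens in the toy corpus
print(tokens_from_tfidf)   # 0: every fractional weight truncates to zero
```

If this is indeed the cause, scikit-learn's CountVectorizer, which accepts the same max_features and stop_words arguments and returns integer counts, may work in place of TfidfVectorizer here.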