I am trying to generate vectors from a list of sentences.
x1 = 'Today I’d like to start a series of some posts concerning extreme value analysis using R.'
x2 = 'Basically, there are several very useful packages in R which provide methods and functions for extreme value analysis. Information on different software (including all relevant R packages) for extreme value analysis can of course be found at the R Task View on Extreme Value Analysis as well as on Eric Gilleland’s website'
x3 = 'In addition, Gilleland, Ribatet & Stephenson have published A software review for extreme value analysis back in 2012, which provides a comprehensive overview of the most important software tools related to this topic.'
self.sentences = [x1, x2, x3]
Then:
documents = []
for uid, line in enumerate(self.sentences):
    documents.append(LabeledSentence(line.split(), 'LOG_' + str(uid)))
self.model_d2v = Doc2Vec(alpha=0.025, min_alpha=0.025, workers=self.workers, size=self.size)
self.model_d2v.build_vocab(documents)
for epoch in range(20):
    self.model_d2v.train(documents)
    self.model_d2v.alpha -= 0.002
    self.model_d2v.min_alpha = self.model_d2v.alpha
Then I get this error:
RuntimeError: you must first build vocabulary before training the model
on the train(documents) line.
I don't understand why, since I called build_vocab just before that.
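As a side note that may be unrelated to the RuntimeError: the training loop above lowers alpha by 0.002 on every pass. Below is a minimal plain-Python sketch that only replays that arithmetic, without gensim. The names start_alpha, step, alphas and first_non_positive are made up for this illustration and do not come from the library.

```python
# Plain-Python replay of the alpha schedule used in the training loop.
# No gensim involved; the variable names here are illustrative only.
start_alpha = 0.025   # value passed as alpha= to Doc2Vec
step = 0.002          # amount subtracted after each train() call
epochs = 20           # number of passes in the loop

alphas = []
alpha = start_alpha
for epoch in range(epochs):
    alphas.append(alpha)   # alpha in effect during this pass
    alpha -= step

# Index (0-based) of the first pass whose alpha is zero or negative, if any.
first_non_positive = next((i for i, a in enumerate(alphas) if a <= 0), None)

print("first pass alpha:", alphas[0])
print("last pass alpha:", alphas[-1])
print("first non-positive pass (0-based):", first_non_positive)
```

With these numbers alpha drops below zero from the 14th pass onward (0.025 - 13 * 0.002 = -0.001), so a smaller step or fewer passes might be needed regardless of the error above.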
Could you give me some hints?