R RKEA - Not enough training instances with class labels (required: 1, provided: 0)!

Time: 2017-10-17 14:01:23

Tags: r keyword extraction tm corpus

I'm trying to get RKEA to work in RStudio. Here is my code so far:

#Imports packages
library(RKEA)
library(tm)

#Creates a corpus of training sentences
data <- c("This is a sentence",
          "I am in an office",
          "I'm working on a laptop",
          "I have a glass of water",
          "There is a wooden desk",
          "I have an apple for lunch")
data <- as.data.frame(data)
data <- Corpus(VectorSource(data$data))

#Creates a corpus of training keywords
keywords <- c("sentence",
              "office",
              "working",
              "glass",
              "wooden",
              "apple")
keywords <- as.data.frame(keywords)
keywords <- Corpus(VectorSource(keywords$keywords))

#Creates output file for created model
tmpdir <- tempfile()
dir.create(tmpdir)
model <- file.path(tmpdir, "MyModel")

#Creates RKEA model
createModel(data, keywords, model)

This is mostly modeled on the example given in the RKEA documentation. However, when I run it, I get the following error message:

Error in .jcall(km, "V", "saveModel") : 
  weka.core.WekaException: weka.classifiers.bayes.NaiveBayesSimple: Not enough training instances with class labels (required: 1, provided: 0)!

1 Answer:

Answer 0 (score: 5)

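I think your example sentences are too short to serve as documents. The following modification (mainly to the first example document) works: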

data <- c("This is a longer and longer sentence.",
          "I am in an office.",
          "I'm working on a laptop.",
          "I have a glass of water.",
          "There is a wooden desk.",
          "I have an apple for lunch.")

My guess is that if the sentences are very short, there are not enough words available to build the model.
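
If the length guess is right, a quick way to spot candidate documents is to count the words in each training sentence before calling createModel. The sketch below uses base R only. The names docs, word_counts, min_words and short_docs are made up for illustration, and the threshold of 5 is an assumption, not an RKEA or Weka setting. Note that the modified data above also adds trailing periods, so this check only covers the length hypothesis.

```r
# Training sentences from the question (unchanged).
docs <- c("This is a sentence",
          "I am in an office",
          "I'm working on a laptop",
          "I have a glass of water",
          "There is a wooden desk",
          "I have an apple for lunch")

# Count whitespace-separated tokens in each document.
word_counts <- lengths(strsplit(docs, "\\s+"))

# min_words is an assumed, illustrative threshold; it is NOT an RKEA parameter.
min_words <- 5

# Documents that fall below the assumed threshold.
short_docs <- docs[word_counts < min_words]

# Show the counts next to the text so short documents are easy to spot.
print(data.frame(words = word_counts, doc = docs))
print(short_docs)
```

With this threshold only the first sentence is flagged, which is the one that was lengthened in the modified data. Treat it as a starting point for experimenting rather than a rule.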