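Machine: Windows 7, 64-bit. R version: R version 3.1.2 (2014-10-31), "Pumpkin Helmet".
I am trying to stem some text for an analysis I am working on, and everything works right up until the stem completion step ('stemComplete'). For more background, see below.
Packages: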
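library(tm)  # provides Corpus, tm_map, stemDocument, stemCompletion
test <- c('win', 'winner', 'wins', 'wins', 'winning')
Test_Corpus <- Corpus(VectorSource(test))
# Normalise the text: lower-case, then strip punctuation and digits
Test_Corpus <- tm_map(Test_Corpus, content_transformer(tolower))
Test_Corpus <- tm_map(Test_Corpus, removePunctuation)
Test_Corpus <- tm_map(Test_Corpus, removeNumbers)
# Stem every document, then try to complete the stems using the corpus as the dictionary
Test_stem <- tm_map(Test_Corpus, stemDocument, language = 'english')
# This call produces the warnings shown below
Test_complete <- tm_map(Test_stem, stemCompletion, Test_Corpus)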
1: In grep(sprintf("^%s", w), dictionary, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
2: In grep(sprintf("^%s", w), dictionary, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
3: In grep(sprintf("^%s", w), dictionary, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
4: In grep(sprintf("^%s", w), dictionary, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
5: In grep(sprintf("^%s", w), dictionary, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
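I have tried several things listed in previous posts, and I have seen other people with the same problem try them, with no luck. Here is a list of those things:
Answer 0: (score: 0)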
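I think you need to save Test_Corpus as a dictionary before the stemming step, because after stemming the full words are no longer available to complete against. You could try something like Test_Corpus <- corpus to keep a copy, then stem the corpus, and then use that copy when you run Test_complete <- tm_map(corpus, stemCompletion).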
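A minimal sketch of the "save a dictionary before stemming" idea, assuming the tm and SnowballC packages are installed. It is not necessarily what the answer had in mind: it skips tm_map altogether and calls stemCompletion on a plain character vector of stems, which is the kind of input that function is documented to take. The names words, dictionary, stems and completed are made up for illustration.

```r
library(tm)         # stemDocument, stemCompletion
library(SnowballC)  # stemming backend used by tm

# Illustrative input: the same words as in the question
words <- c("win", "winner", "wins", "wins", "winning")

# Keep the unstemmed words as the dictionary BEFORE stemming
dictionary <- words

# Stem each word (one word per element, so no tokenising is needed here)
stems <- stemDocument(words, language = "english")

# Complete each stem against the saved dictionary;
# type = "prevalent" picks the most frequent matching dictionary entry
completed <- stemCompletion(stems, dictionary = dictionary, type = "prevalent")

print(completed)
```

For real documents with several words each, the text would first have to be split into individual words before calling stemCompletion and pasted back together afterwards; that step is left out of this sketch.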