Error when trying to create a document-term matrix in R

Time: 2016-08-02 17:49:08

Tags: r text-mining

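I have the following code, but I get an error when I try to create the document-term matrix. (Originally my data was in a one-column csv file that I loaded with read.csv, but for reproducibility I build an equivalent data frame below.)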

library(tm)
TEXTS<- as.data.frame(c("I am a cat person", "I like both cats and dogs"), stringsAsFactors = FALSE)
docs<-VCorpus(VectorSource(TEXTS))
docs <- tm_map(docs, removePunctuation) 
docs <- tm_map(docs, removeNumbers) 
docs <- tm_map(docs, content_transformer(tolower), lazy = TRUE)   
docs <- tm_map(docs, PlainTextDocument, lazy = TRUE) 
docs <- tm_map(docs, removeWords, stopwords("english"), lazy = TRUE)  
library(SnowballC)   
docs <- tm_map(docs, stemDocument, language = meta(docs, "english"), lazy = TRUE) 
dtm <- DocumentTermMatrix(docs) 

This is the error I get from the last line:

Error in stemDocument.PlainTextDocument(x, ...) : 
  promise already under evaluation: recursive default argument reference or earlier problems?
In addition: Warning message:
In stemDocument.PlainTextDocument(x, ...) :
  restarting interrupted promise evaluation

What should I do? Thanks.

1 answer:

Answer 0: (score: 0)

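Why are you calling the PlainTextDocument function? I removed it, and I also removed the meta() reference from the language argument of the stemming step.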
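As a side note on the error message itself: the same text shows up in base R whenever evaluating an argument ends up needing that same argument again. The sketch below only illustrates the message with a hypothetical function f; it is not a trace of what tm does internally. My guess is that with lazy = TRUE the meta(docs, ...) argument gets evaluated late, after docs already points at the lazily mapped corpus, which would create a similar loop.

```r
# `f` is a hypothetical function whose default argument refers to itself.
f <- function(x = x) x

# Forcing x evaluates the default `x`, which forces the same promise again,
# so R stops with the "promise already under evaluation" error.
msg <- tryCatch(f(), error = function(e) conditionMessage(e))
print(msg)
```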

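I have reordered your code. Keep in mind that if you repeatedly call functions whose first argument is the same variable you assign the result to, you can use the %>% pipe (which dplyr re-exports from magrittr) to make your code read more smoothly (https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html).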
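To make the pipe concrete: x %>% f(y) is read as f(x, y). Here is a small sketch on a plain string, unrelated to tm. It assumes magrittr is installed and falls back to the nested result otherwise; raw, nested and piped are names made up for this illustration.

```r
raw <- "  Cats and Dogs  "

# Nested form: read from the inside out.
nested <- tolower(trimws(raw))

# Piped form: the same two calls, read left to right.
# Falls back to the nested result when magrittr is not available.
piped <- nested
if (requireNamespace("magrittr", quietly = TRUE)) {
  `%>%` <- magrittr::`%>%`
  piped <- raw %>% trimws() %>% tolower()
}

print(identical(nested, piped))
```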

library(tm)
library(SnowballC)  # stemmer used by stemDocument
library(dplyr)      # provides %>%; install it if you don't have this package

TEXTS <- as.data.frame(c("I am a cat person", "I like both cats and dogs"), stringsAsFactors = FALSE)
docs <- VCorpus(VectorSource(TEXTS))
# Chain the cleaning steps; each tm_map call receives the corpus from the previous step
docs <- tm_map(docs, removePunctuation) %>%
  tm_map(removeNumbers) %>%
  tm_map(content_transformer(tolower), lazy = TRUE) %>%
  tm_map(removeWords, stopwords("english"), lazy = TRUE) %>%
  tm_map(stemDocument, language = "english", lazy = TRUE)  # plain string, no meta() lookup
dtm <- DocumentTermMatrix(docs)
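
One more thing that may be worth checking (this is my assumption about what you want, not something stated in your post): TEXTS is a one-column data frame, and as far as I understand VectorSource treats each element of its input as a document. For a data frame the elements are columns, so you would likely get one document per column rather than one per sentence. A minimal sketch of the difference, using base R for the length check and tm only if it is installed; texts and docs_by_row are names introduced here for illustration:

```r
# TEXTS mirrors the data frame from the question.
TEXTS <- as.data.frame(c("I am a cat person", "I like both cats and dogs"),
                       stringsAsFactors = FALSE)

n_columns <- length(TEXTS)   # a data frame is a list of columns
texts <- TEXTS[[1]]          # the single column as a character vector
n_texts <- length(texts)     # one element per sentence

# Optional check with tm: build the corpus from the character vector
# so that each sentence becomes its own document.
if (requireNamespace("tm", quietly = TRUE)) {
  docs_by_row <- tm::VCorpus(tm::VectorSource(texts))
  print(length(docs_by_row))
}
```

If that is what you want, pass TEXTS[[1]] (or the column read by read.csv) to VectorSource instead of the whole data frame.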