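I have the following code, but I get an error when I try to create the document-term matrix. (Originally my data was in a CSV file with a single column, which I loaded with read.csv; for reproducibility, I create an equivalent data frame below.)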
library(tm)
TEXTS<- as.data.frame(c("I am a cat person", "I like both cats and dogs"), stringsAsFactors = FALSE)
docs<-VCorpus(VectorSource(TEXTS))
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, content_transformer(tolower), lazy = TRUE)
docs <- tm_map(docs, PlainTextDocument, lazy = TRUE)
docs <- tm_map(docs, removeWords, stopwords("english"), lazy = TRUE)
library(SnowballC)
docs <- tm_map(docs, stemDocument, language = meta(docs, "english"), lazy = TRUE)
dtm <- DocumentTermMatrix(docs)
This is the error I get from the last line:
Error in stemDocument.PlainTextDocument(x, ...) :
promise already under evaluation: recursive default argument reference or earlier problems?
In addition: Warning message:
In stemDocument.PlainTextDocument(x, ...) :
restarting interrupted promise evaluation
What should I do? Thanks.
Answer 0 (score: 0)
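Why are you calling the PlainTextDocument function? I removed it, and I also removed the meta() reference from the language argument of the stemming step.
I have also reorganized your code. Keep in mind that when you repeatedly call functions whose first argument is the object you are reassigning, you can use the %>% pipe (it comes from magrittr and is re-exported by dplyr) to make the code read more smoothly (https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html).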
library(tm)
library(SnowballC)
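library(dplyr)  # provides the %>% pipe (re-exported from magrittr); install it if you don't have it
TEXTS <- as.data.frame(c("I am a cat person", "I like both cats and dogs"), stringsAsFactors = FALSE)
# Pass the text column rather than the whole data frame, so each text becomes its own document
docs <- VCorpus(VectorSource(TEXTS[[1]]))
# Clean the corpus step by step; each tm_map() receives the previous result as its first argument
docs <- tm_map(docs, removePunctuation) %>%
  tm_map(removeNumbers) %>%
  tm_map(content_transformer(tolower), lazy = TRUE) %>%
  tm_map(removeWords, stopwords("english"), lazy = TRUE) %>%
  tm_map(stemDocument, language = "english", lazy = TRUE)
dtm <- DocumentTermMatrix(docs)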
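A separate point worth checking when building the corpus: VectorSource() creates one document per element of its input, and a data frame is a list of columns. So passing a one-column data frame would, as far as I can tell, give a single document rather than one document per text. Below is a minimal sketch of that check. The column name `text` is my own assumption (it does not appear in the question), and the tm part is skipped if the package is not installed.

```r
# Hypothetical reproduction of the question's data; the column name `text`
# is an assumption made for this illustration.
TEXTS <- data.frame(
  text = c("I am a cat person", "I like both cats and dogs"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so its length is the number of columns.
n_as_frame <- length(TEXTS)

# The column itself is a character vector with one element per text.
n_as_vector <- length(TEXTS$text)

print(c(frame = n_as_frame, vector = n_as_vector))

# Only run the tm comparison when the package is available.
if (requireNamespace("tm", quietly = TRUE)) {
  docs_frame  <- tm::VCorpus(tm::VectorSource(TEXTS))       # whole data frame
  docs_vector <- tm::VCorpus(tm::VectorSource(TEXTS$text))  # text column only
  # Number of documents in each corpus
  print(c(frame = length(docs_frame), vector = length(docs_vector)))
}
```

If the two corpus lengths differ on your machine, passing the column (TEXTS$text or TEXTS[[1]]) to VectorSource() is probably what you want before calling DocumentTermMatrix().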