I ran a sentiment analysis on my dataset and got this error message:
"Error in structure(if (length(n)) n else NA, names = x) : 'names' attribute [2] must be the same length as the vector [1]"
Please help!
myCorpus<-Corpus(VectorSource(Datasetlow_cost_airline$text))
# Convert to lower case
myCorpus<-tm_map(myCorpus,tolower)
# Remove punctuation
myCorpus<-tm_map(myCorpus,removePunctuation)
# Remove numbers
myCorpus<-tm_map(myCorpus,removeNumbers)
# Remove URLs ?regex = regular expression ?gsub = pattern matching
removeURL<-function(x)gsub("http[[:alnum:]]*","",x)
myCorpus<-tm_map(myCorpus,removeURL)
stopwords("english")
# Add three extra stop words: 'available', 'via' and 'can'
myStopwords<-c(stopwords("english"),"available","via","can")
# Remove stopwords from corpus
myCorpus<-tm_map(myCorpus,removeWords,myStopwords)
# Keep a copy of corpus to use later as a dictionary for stem completion
myCorpusCopy<-myCorpus
# Stem words (reduce each word to its root form)
myCorpus<-tm_map(myCorpus,stemDocument)
# Inspect documents (tweets) numbered 11 to 15
for(i in 11:15){
cat(paste("[[",i,"]]",sep=""))
writeLines(strwrap(myCorpus[[i]],width=73))
}
# Stem completion
myCorpus<-tm_map(myCorpus,stemCompletion,dictionary=myCorpusCopy)
Answer 0 (score: 1)
The stemCompletion function in tm version 0.6 seems to behave a bit oddly. There is a good workaround, which I used for this answer. In short, replace your
# Stem completion
myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = myCorpusCopy) # use spaces!
with
# Stem completion
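stemCompletion_mod <- function(x, dict) {
  # Split the document into words, complete each stem, then re-join into one string
  words <- unlist(strsplit(as.character(x), " "))
  completed <- stemCompletion(words, dictionary = dict, type = "shortest")
  PlainTextDocument(stripWhitespace(paste(completed, sep = "", collapse = " ")))
}
# apply workaround function (note: lapply returns a plain list, not a corpus)
myCorpus <- lapply(myCorpus, stemCompletion_mod, myCorpusCopy)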
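The shape of the workaround (split each document into tokens, complete token by token, paste the result back together) can be sketched in base R without tm. Everything named here is hypothetical: `complete_stems` is a stand-in lookup that only imitates what stemCompletion does, and `dict` is a made-up named vector rather than a real corpus dictionary.

```r
# Hypothetical stand-in for tm::stemCompletion: a named lookup from stem to full word.
complete_stems <- function(stems, dict) {
  out <- dict[stems]                    # look up each stem by name
  missing <- is.na(out)
  out[missing] <- stems[missing]        # keep the stem when no completion is known
  unname(out)
}

# Same shape as stemCompletion_mod: split, complete token by token, re-join.
complete_document <- function(doc, dict) {
  tokens <- unlist(strsplit(doc, " ", fixed = TRUE))
  tokens <- tokens[nzchar(tokens)]      # drop empty strings left by repeated spaces
  paste(complete_stems(tokens, dict), collapse = " ")
}

# Made-up dictionary and document, for illustration only.
dict <- c(airlin = "airline", cheap = "cheap", servic = "service")
result <- complete_document("cheap airlin  servic today", dict)
print(result)
```

The point of the design is that the completion step always receives a vector of single words, never a whole document, so the number of inputs and the number of outputs stay in step.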
If this doesn't help, you will need to provide more details and a sample of your actual data.
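For reference, the error text itself can be reproduced in base R, which may help when reading the message. This is only a sketch of the mechanism, using the `structure(..., names = x)` call quoted in the error; the stems used as names are invented, and it is an assumption that the same mismatch (one value, two names) is what happens inside stemCompletion for your data.

```r
# Base R only; no tm required.
# structure() refuses a names attribute that is longer than the vector it labels.
# Here the vector has length 1 (a single NA) but two names are supplied.
res <- tryCatch(
  structure(NA, names = c("fli", "airlin")),   # invented stems, for illustration
  error = function(e) conditionMessage(e)
)
print(res)
```

The printed message reports the two lengths in brackets, matching the `[2]` and `[1]` in the error from the question.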