I am using R (3.2.5) with the following packages loaded: 'SnowballC', 'tm', 'NLP', 'RWeka', 'RTextTools', 'wordcloud', 'fpc'
carmenCorpus <- Corpus(VectorSource(feedback$Description))
carmenCorpus <- tm_map(carmenCorpus, PlainTextDocument)
carmenCorpus <- tm_map(carmenCorpus, removePunctuation)
carmenCorpus <- tm_map(carmenCorpus, removeWords, stopwords('english'))
carmenCorpus <- tm_map(carmenCorpus, stemDocument)
When I create the word cloud, I get the following error. The error is new: the same code ran without problems a few months ago:
wordcloud(carmenCorpus, max.words = 100, random.order = FALSE)
# Error in simple_triplet_matrix(i, j, v, nrow = length(terms), ncol = length(corpus), :
# 'i, j' invalid
Please advise on this issue.
Answer 0 (score: 0)
wordcloud can't just take a corpus and magically produce a word cloud. You have to do the work of converting it to a TermDocumentMatrix and then summing the word frequencies:
# convert to TDM
tdm <- TermDocumentMatrix(carmenCorpus, control = list(stemming = TRUE))
# calculate word frequencies
freqs <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
# plot wordcloud
wordcloud(names(freqs), freqs,
          max.words = 100,
          random.order = FALSE)  # add any other wordcloud() parameters here
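To make the frequency step concrete, here is a minimal base-R sketch of what summing the rows of a term-document matrix produces. It does not use tm. The three toy documents and the names docs, tokens, terms, tdm_toy and freqs_toy are made up for illustration and are not part of the question's data.

```r
# Toy documents (hypothetical data, not taken from the question)
docs <- c("great service great staff", "slow service", "great food")

# Split each document into lower-case words
tokens <- strsplit(tolower(docs), "\\s+")

# All distinct terms across the documents
terms <- sort(unique(unlist(tokens)))

# Term-by-document count matrix: one row per term, one column per document
tdm_toy <- vapply(
  tokens,
  function(words) as.integer(table(factor(words, levels = terms))),
  integer(length(terms))
)
rownames(tdm_toy) <- terms

# Row sums give the total frequency of each term over all documents
freqs_toy <- sort(rowSums(tdm_toy), decreasing = TRUE)
print(freqs_toy)
```

The resulting named vector has the same shape as freqs above: names(freqs_toy) holds the words and the values hold their counts, which are the two arguments wordcloud() is given.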