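I have the following data. I first create a document-term matrix from it and then add a time variable to the DTM. To see how the associations with the term "updat" differ between years, I would like to plot the findAssocs output as a heat map with time on the x-axis.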
library(tm)
library(ggplot2)
df <- structure(list(Description = structure(c(5L, 8L, 6L, 4L, 1L,
2L, 7L, 9L, 10L, 3L), .Label = c("general topics done", "keep the general topics updated",
"rejected topic ", "several topics in hand", "this is a genetal topic",
"topic 333555 needs to be updated", "topic 5647 is handed over",
"topic is updated", "update the topic ", "updating the topic is done "
), class = "factor")), class = "data.frame", row.names = c(NA,
-10L))
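# Build the corpus from the text column (the factor is converted to character first)
corpus <- Corpus(VectorSource(as.character(df$Description)))
# Wrap base functions such as tolower in content_transformer() so tm_map keeps the corpus structure
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")
frequenciescontrol <- DocumentTermMatrix(corpus)
# Attach the year of each document to the DTM object
frequenciescontrol$time <- c("2015", "2015", "2015", "2015", "2015", "2016", "2016", "2016", "2016", "2016")
# Terms whose correlation with "updat" is at least 0.01
findAssocs(frequenciescontrol, "updat", 0.01)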
So I want a heat map in which the y-axis shows the words associated with "updat" and the x-axis shows the year.
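For illustration, here is a minimal sketch of one way such a plot might be built. It is not the only approach. It assumes the year is kept in a separate vector aligned with the documents (the hypothetical `year` below) rather than inside the DTM, that findAssocs is run once per year on the matching rows, and that the results are stacked into a long data frame for geom_tile. The names `docs`, `year`, `assoc_for_year` and `assoc_df` are made up for this example, and stemDocument needs the SnowballC package to be installed.

```r
library(tm)
library(ggplot2)

# Same ten descriptions as in the question, in document order
docs <- c("this is a genetal topic", "topic is updated",
          "topic 333555 needs to be updated", "several topics in hand",
          "general topics done", "keep the general topics updated",
          "topic 5647 is handed over", "update the topic",
          "updating the topic is done", "rejected topic")
# Assumption: one year label per document, kept outside the DTM
year <- rep(c("2015", "2016"), each = 5)

corpus <- VCorpus(VectorSource(docs))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")
dtm <- DocumentTermMatrix(corpus)

# Run findAssocs on the rows of a single year and return a long data frame
assoc_for_year <- function(y) {
  a <- findAssocs(dtm[which(year == y), ], "updat", 0.01)[["updat"]]
  if (length(a) == 0) return(NULL)  # no associated terms in this year
  data.frame(year = y, term = names(a), cor = unname(a),
             stringsAsFactors = FALSE)
}
assoc_df <- do.call(rbind, lapply(unique(year), assoc_for_year))

# Heat map: years on the x-axis, associated terms on the y-axis
p <- ggplot(assoc_df, aes(x = year, y = term, fill = cor)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "lightyellow", high = "red") +
  labs(x = "Year", y = "Term associated with \"updat\"", fill = "Correlation")
print(p)
```

Terms that do not reach the correlation limit in a given year have no row in `assoc_df`, so their tiles are simply left blank. With only five documents per year the correlations are very coarse, so the picture is mainly useful as a template for a larger corpus.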