K-means clustering: extracting the top terms per cluster in R

Time: 2019-05-31 11:56:16

Tags: r cluster-analysis term

I am trying to extract the top terms in each cluster. Below is the code I used for k-means clustering.

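Data <- read.csv("Task.csv", header = TRUE, na.strings = c(""), stringsAsFactors = FALSE)

Desc <- subset(Data, select = c("text"))

library(tm)
library(stringi)
library(proxy)

docs <- Corpus(VectorSource(Desc$text))

# Custom functions are wrapped in content_transformer() so tm_map() keeps the corpus structure
docs2 <- tm_map(docs, content_transformer(function(x) stri_replace_all_regex(x, "<.+?>", " ")))  # strip HTML tags
# "\t" (tab) is assumed here; a plain "t" would delete every letter t from the text
docs3 <- tm_map(docs2, content_transformer(function(x) stri_replace_all_fixed(x, "\t", " ")))
docs4 <- tm_map(docs3, content_transformer(tolower))  # lowercase first so the stopword list matches
docs5 <- tm_map(docs4, removeWords, stopwords("english"))
docs6 <- tm_map(docs5, removePunctuation)
docs7 <- tm_map(docs6, stripWhitespace)  # collapse the gaps left by the steps above
dtm <- DocumentTermMatrix(docs7)
dtm2 <- as.matrix(dtm)
dim(dtm2)


set.seed(42)  # kmeans() starts from random centers; a seed makes the run repeatable
fit <- kmeans(dtm2, 20)
fit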

Is there any way to get the top terms in each cluster in R? I was able to do this in Python, but I have not found a solution in R.
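
One possible approach, sketched under assumptions: the object returned by kmeans() has a centers matrix with one row per cluster and one column per term, so sorting each row in decreasing order gives the terms with the highest mean count in that cluster. The helper name top_terms and the toy matrix toy_dtm below are made up for illustration; toy_dtm only stands in for dtm2.

```r
# Toy document-term matrix (rows = documents, columns = terms).
# Illustrative data only; it stands in for dtm2 from the question.
toy_dtm <- matrix(
  c(3, 2, 0, 0,
    2, 3, 0, 0,
    0, 0, 4, 1,
    0, 0, 3, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("apple", "banana", "engine", "wheel"))
)

set.seed(1)
toy_fit <- kmeans(toy_dtm, centers = 2, nstart = 10)

# Hypothetical helper: for each cluster, return the n terms with the
# largest centroid values (the highest mean count within that cluster).
top_terms <- function(fit, n = 10) {
  lapply(seq_len(nrow(fit$centers)), function(k) {
    center <- sort(fit$centers[k, ], decreasing = TRUE)
    head(center, n)
  })
}

top <- top_terms(toy_fit, n = 2)
print(top)
```

With the objects from the question, the equivalent call would be top_terms(fit, 10), which returns a list of 20 named numeric vectors. Ranking by raw centroid counts tends to favor words that are frequent everywhere, so a TF-IDF weighted matrix (for example DocumentTermMatrix(docs7, control = list(weighting = weightTfIdf)) in tm) may give more distinctive terms; whether that helps depends on the data.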

0 answers:

No answers