R: topicmodels package: LDA: error: invalid argument

Time: 2018-08-16 23:28:54

Tags: r lda topic-modeling invalid-argument

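I have a question about LDA in R's topicmodels package. I built a data frame with documents as rows, terms as columns, and the number of times each term occurs in each document as the values. When I try to run LDA, I get the error message "Error in !all.equal(x$v, as.integer(x$v)) : invalid argument type". The data contain 1675 documents and 368 terms. What do I need to change to get the code working?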

library("tm")
library("topicmodels")
library("dplyr")  # group_by(), tally(), %>%
library("tidyr")  # spread()
data_matrix <- data %>%
group_by(documents, terms) %>%
tally %>%
spread(terms, n, fill=0)
doctermmatrix <- as.DocumentTermMatrix(data_matrix, weightTf("data_matrix"))
lda_head <- topicmodels::LDA(doctermmatrix, 10, method="Gibbs")

Thank you very much for your help!

Edit

# Toy Data
documentstoy <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)
meta1toy <- c(3,4,1,12,1,2,3,5,1,4,2,1,1,1,1,1)
meta2toy <- c(10,0,10,1,1,0,1,1,3,3,0,0,18,1,10,10)
termstoy <- c("cus","cus","bill","bill","tube","tube","coa","coa","un","arc","arc","yib","yib","yib","dar","dar")
toydata <- data.frame(documentstoy,meta1toy,meta2toy,termstoy)

1 Answer:

Answer 0: (score: 0)

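I had a look inside the code, and apparently the LDA() function only accepts integers as input, so you have to convert the categorical variable as follows: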

library('tm')
library('topicmodels')

# Toy data: document IDs, two metadata columns, and one term per document
documentstoy <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)
meta1toy <- c(3,4,1,12,1,2,3,5,1,4,2,1,1,1,1,1)
meta2toy <- c(10,0,10,1,1,0,1,1,3,3,0,0,18,1,10,10)
toydata <- data.frame(documentstoy, meta1toy, meta2toy)
termstoy <- c("cus","cus","bill","bill","tube","tube","coa","coa","un","arc","arc","yib","yib","yib","dar","dar")

# Add one 0/1 integer indicator column per unique term
toy_unique <- unique(termstoy)
for (i in seq_along(toy_unique)) {
  toydata[toy_unique[i]] <- as.integer(termstoy == toy_unique[i])
}

# Note: toydata still contains the document ID and metadata columns,
# so they are passed to LDA along with the term indicators
lda_head <- topicmodels::LDA(toydata, 10, method = "Gibbs")
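As a rough sketch of what seems to be going on, the snippet below first reproduces the error message in base R and then builds an integer document-by-term count matrix from the toy data with table(). The names res, err, counts, dtm and lda_toy, and the choice of k = 2, are my own illustrative assumptions rather than anything from the question. Passing a plain integer matrix relies on the LDA() documentation, which says the input may be an object coercible to a "simple_triplet_matrix" with integer entries. The LDA() call is guarded so the rest runs even without topicmodels installed.

```r
# Toy data in long format: one row per (document, term) occurrence.
documentstoy <- 1:16
termstoy <- c("cus", "cus", "bill", "bill", "tube", "tube", "coa", "coa",
              "un", "arc", "arc", "yib", "yib", "yib", "dar", "dar")

# Part 1: why the message reads "invalid argument type".
# all.equal() returns a character description (not FALSE) when its
# arguments differ, and `!` cannot be applied to a character value.
res <- all.equal(c("1", "cus"), c(1L, NA))
err <- tryCatch(!res, error = function(e) conditionMessage(e))

# Part 2: an integer document-by-term count matrix.
# table() cross-tabulates documents against terms; as.integer() plus
# matrix() turns the result into a plain integer matrix with dimnames.
counts <- table(document = documentstoy, term = termstoy)
dtm <- matrix(as.integer(counts), nrow = nrow(counts),
              dimnames = dimnames(counts))

# The same kind of checks LDA() appears to perform on its input:
# integer entries only, and no empty documents.
stopifnot(isTRUE(all.equal(as.vector(dtm), as.integer(dtm))))
stopifnot(all(rowSums(dtm) > 0))

# Fit the model only if topicmodels is available.
# k = 2 is an arbitrary choice for this tiny example.
if (requireNamespace("topicmodels", quietly = TRUE)) {
  lda_toy <- topicmodels::LDA(dtm, k = 2, method = "Gibbs")
}
```

Unlike the data frame above, dtm contains only term counts, so document IDs and metadata are not treated as terms. For the real data, the same idea would presumably mean dropping the documents column (and any grouping) from the wide table before handing it to LDA().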