library(tm)
library(topicmodels)
lda_topicmodel <- LDA(dtm, k = 20, control = list(seed = 1234))
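I ran Latent Dirichlet Allocation using the LDA function in R. The resulting LDA object is in S4 format.
How do I convert it into a word-topic matrix and a document-topic matrix in R? Unfortunately, an object of type 'S4' is not subsettable, so I have had to resort to copying a subset of the data to work with.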
Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8 Topic 9 Topic 10
[1,] "flooding" "beach" "sets" "flooding" "storm" "fwy" "storms" "flooding" "socal" "rain"
[2,] "erosion" "long" "alltime" "just" "flooding" "due" "thunderstorms" "via" "major" "california"
[3,] "cause" "abc7" "rain" "almost" "years" "closures" "flash" "public" "throughout" "nearly"
[4,] "emergency" "day" "slides" "hardcore" "mudslides" "avoid" "continue" "asks" "abc7" "southern"
[5,] "highway" "history" "last" "spun" "snow" "latest" "possible" "call" "streets" "storms"
Topic 11 Topic 12 Topic 13 Topic 14 Topic 15 Topic 16 Topic 17 Topic 18 Topic 19 Topic 20
[1,] "abc7" "abc7" "like" "widespread" "widespread" "across" "rainfall" "flooding" "flooding" "vehicles"
[2,] "beach" "flooding" "closed" "batters" "biggest" "can" "record" "region" "storm" "several"
[3,] "long" "stranded" "live" "california" "evacuations" "stay" "breaks" "reported" "california" "getting"
[4,] "fwy" "county" "raining" "evacuations" "mudslides" "home" "long" "corona" "causes" "floodwaters"
[5,] "710" "san" "blog" "mudslides" "years" "wires" "beach" "across" "related" "stranded"
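The image shows a subset of the words in each topic: (image: LDA word-topic)
I would like to write the contents of the S4 object to a CSV file as a word-topic matrix, like this: (image: Word-Topic Matrix)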
Answer 0 (score: 1)
I'm using a dataset available in R, since we can't reproduce your data.
# load the libraries
library(topicmodels)
library(tm)
# load the data we'll be using
data("AssociatedPress")
# estimate an LDA model using the VEM algorithm (the default)
# k (the number of topics) is set to 2 here,
# just as an example
ap_lda <- LDA(AssociatedPress,
              k = 2,
              control = list(seed = 1234))
# get all the terms in a dataframe
as.data.frame(terms(ap_lda, dim(ap_lda)[1]))
The output is:
Topic 1 Topic 2
1 percent i
2 million president
3 new government
4 year people
5 billion soviet
6 last new
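The `terms()` call above returns the ranked words per topic, not the numeric matrices the question asks for. A minimal sketch of one way to get those, assuming the `posterior()` method from `topicmodels` is acceptable here: it returns a list whose `terms` element is a topics-by-words probability matrix and whose `topics` element is a documents-by-topics probability matrix. The object names and CSV file names below are hypothetical, chosen only for illustration.

```r
library(topicmodels)

# Same example data and model as above
data("AssociatedPress")
ap_lda <- LDA(AssociatedPress, k = 2, control = list(seed = 1234))

# posterior() returns ordinary matrices, so no S4 subsetting is needed
post <- posterior(ap_lda)

# post$terms is topics x words; transpose it to get one row per word
word_topic <- t(post$terms)
colnames(word_topic) <- paste("Topic", seq_len(ncol(word_topic)))

# post$topics is documents x topics
doc_topic <- post$topics
colnames(doc_topic) <- paste("Topic", seq_len(ncol(doc_topic)))

# Hypothetical output file names, written to the working directory
write.csv(word_topic, "word_topic_matrix.csv")
write.csv(doc_topic, "doc_topic_matrix.csv")
```

Each column of `word_topic` is a probability distribution over the vocabulary, and each row of `doc_topic` is a distribution over topics. The same values should also be reachable through the model's slots with the `@` operator (for example `ap_lda@gamma`), which is the S4 way of accessing what `$` or `[` would reach in a list.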