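I have been using the sklearn.decomposition.LatentDirichletAllocation module to explore a corpus of documents. After many iterations of training and tuning the model (i.e. adding stop words and synonyms, varying the number of topics), I am fairly happy and familiar with the distilled topics. As a next step, I would like to apply the trained model to a new corpus.
Is it possible to apply the fitted model to a new set of documents to determine their topic distributions?
I know this is possible in the gensim library, where you can train a model: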
from gensim.test.utils import common_texts
from gensim.corpora.dictionary import Dictionary
from gensim.models.ldamodel import LdaModel
# Create a corpus from a list of texts
common_dictionary = Dictionary(common_texts)
common_corpus = [common_dictionary.doc2bow(text) for text in common_texts]
# Train the model on the corpus
lda = LdaModel(common_corpus, num_topics=10)
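And then apply the trained model to a new corpus:
# unseen_doc is a bag-of-words vector built with the same dictionary
topic_distributions = lda[unseen_doc]
Source: https://radimrehurek.com/gensim/models/ldamodel.html
How can I do this with scikit-learn's implementation of LDA?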
Answer 0 (score: 3)
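Doesn't the transform method of the fitted LatentDirichletAllocation do this? It returns the document-topic distribution for whatever document-term matrix you pass it.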
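To make that concrete, here is a minimal sketch of how transform might be used on unseen documents. The toy documents, the choice of CountVectorizer, and n_components=2 are assumptions made purely for illustration and are not taken from the question. The point to watch is that the new documents go through the already fitted vectorizer, so their columns line up with the vocabulary the model was trained on.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpora, invented for illustration only.
train_docs = [
    "the cat sat on the mat",
    "dogs and cats are common pets",
    "stock markets fell sharply today",
    "investors worry about market volatility",
]
new_docs = [
    "my cat chased the dog",
    "the market rallied after the report",
]

# Fit the vectorizer and the model on the training corpus only.
vectorizer = CountVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(train_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X_train)

# Reuse the fitted vectorizer (transform, not fit_transform) so the new
# documents are mapped onto the same vocabulary columns as the training data.
X_new = vectorizer.transform(new_docs)

# One row per new document, one column per topic.
topic_distributions = lda.transform(X_new)
print(topic_distributions.shape)
print(topic_distributions.sum(axis=1))
```

Each row of topic_distributions plays roughly the same role as lda[unseen_doc] in gensim, except that scikit-learn returns a dense array covering every topic rather than a sparse list of (topic, probability) pairs.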