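I think I'm making a very simple mistake, but I'm new to R and having a hard time figuring it out. I'm trying to use the 'stm' package in R to do some topic modeling on a dataset of tweets I scraped.
The dataset has two columns: one holding the name of the tweet sender, with the column header "meta", and one holding the text of the tweets, with the column header "vocab". After running the script below, I get the following error: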
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
library(stm)
library(igraph)
setwd("c:/Users/Adam/Desktop/RTwitter")
data <-read.csv("TweetDataSTM.csv")
processed <- textProcessor(data$documents, metadata = data)
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
docs <- out$documents
vocab <- out$vocab
meta <-out$meta
> library(stm)
> library(igraph)
> setwd("c:/Users/Adam/Desktop/RTwitter")
>
> rm(list=ls())
>
> data <-read.csv("TweetDataSTM.csv")
> processed <- textProcessor(data$documents, metadata = data)
Building corpus...
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
> out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
Error in prepDocuments(processed$documents, processed$vocab, processed$meta) :
object 'processed' not found
> docs <- out$documents
Error: object 'out' not found
> vocab <- out$vocab
Error: object 'out' not found
> meta <-out$meta
Error: object 'out' not found
(Any advice is greatly appreciated!)
- Adam
Answer 0 (score: 0)
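I think the error occurs because your columns are named vocab and meta, but in this line

processed <- textProcessor(data$documents, metadata = data)

you are trying to use a documents column, which (as far as I can tell) does not exist in your data.frame. Try changing the code to: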
processed <- textProcessor(data$vocab, metadata = data)
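
If you want to confirm this before rerunning the model, here is a minimal base-R sketch of the check. The data frame is a made-up stand-in for TweetDataSTM.csv (the real file's contents are not shown in the question), so the sender names and tweet texts are assumptions. The only point is that selecting a column that does not exist with `$` returns NULL, which leaves textProcessor with no text to build a corpus from.

```r
# Hypothetical stand-in for TweetDataSTM.csv, using the column names
# described in the question ("meta" and "vocab"). The values are invented.
data <- data.frame(
  meta  = c("sender_a", "sender_b"),
  vocab = c("first example tweet", "second example tweet"),
  stringsAsFactors = FALSE
)

# List the columns that actually exist in the data frame.
print(names(data))

# Selecting a missing column with `$` yields NULL rather than an error.
missing_col <- data$documents
print(is.null(missing_col))

# Guard against the mistake before handing text to textProcessor().
stopifnot("vocab" %in% names(data))
texts <- data$vocab

# With the stm package loaded, the corrected call would then be:
# processed <- textProcessor(texts, metadata = data)
```

Running names(data) on your own data frame right after read.csv is usually the quickest way to catch a column name mismatch like this one.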