I want to convert a corpus to a DocumentTermMatrix, keeping only the words in a given list. I know the "dictionary" entry in the control list does this:
a = list("I am a big big big apple", "Petter Petter Peter Peter")
v = VCorpus(VectorSource(a))
my_terms = c("peter", "petter")
DocumentTermMatrix(v, control = list(dictionary = my_terms)) %>% as.matrix()
It gives me this:
    Terms
Docs peter petter
   1     0      0
   2     1      1
whereas what I want is this:
    Terms
Docs peter petter
   1     0      0
   2     2      2
I would like to know whether there is a function or argument that does this.
Answer 0 (score: 0)
This works fine:
library(magrittr)
library(tm)
a <- list("I am a big big big apple", "Petter Petter Peter Peter")
v <- VCorpus(VectorSource(a))
my_terms <- c("peter", "petter")
DocumentTermMatrix(v, control = list(dictionary = my_terms)) %>%
  as.matrix()
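As a cross-check that does not depend on tm, the per-document counts the question asks for can also be computed in base R. This is only a sketch: it assumes that lowercasing the text and splitting on non-letter characters is an acceptable tokenization, which is not necessarily how tm tokenizes internally. The names docs, tokens and counts are illustrative and do not come from the original post.

```r
# Same two documents and dictionary as in the question.
docs <- c("I am a big big big apple", "Petter Petter Peter Peter")
my_terms <- c("peter", "petter")

# Assumption: lowercase, then split on any run of non-letter characters.
tokens <- strsplit(tolower(docs), "[^a-z]+")

# Count only the dictionary terms; factor() with fixed levels keeps
# zero counts for terms that do not occur in a document.
counts <- t(sapply(tokens, function(tok) table(factor(tok, levels = my_terms))))
rownames(counts) <- seq_along(docs)

# Rows are documents, columns are dictionary terms;
# document 2 contains each term twice.
print(counts)
```

If the matrix returned by DocumentTermMatrix differs from this one, the difference is likely to come from the control options or the loaded package versions rather than from the dictionary itself.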