I now have a DTM (document-term matrix), so I converted it into a frequency table:
freqs <- as.data.frame(inspect(dtm1))
Here is what freqs looks like; it contains a single row giving the frequency of each word in the document:
  I really hate school how can are you hi
1 4      5    3      2   3   1   4   5  1
I also have a list of words:
list <- c("hi", "how", "are", "you")
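How can I look up the frequencies of the words in this list in the frequency table, and then compile those frequencies into a table like this?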
hi how are you
 1   3   4   5
Answer 0: (score: 1)
If the words are the column names of a data.frame, you can index the columns by name:
> freqs[,list]
  hi how are you
1  1   3   4   5
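One caveat with indexing by name: if any word in the list is not a column of freqs, R stops with an "undefined columns selected" error. A minimal sketch of a more forgiving lookup, assuming a toy freqs built from the values in the question and a hypothetical extra word "goodbye" that does not appear in the document:

```r
# Toy stand-in for the question's one-row frequency table (values copied from the post)
freqs <- data.frame(I = 4, really = 5, hate = 3, school = 2, how = 3,
                    can = 1, are = 4, you = 5, hi = 1)

# "goodbye" is a hypothetical word that is absent from freqs
wanted <- c("hi", "how", "are", "you", "goodbye")

# Keep only the wanted words that actually exist as columns;
# intersect() preserves the order of its first argument
present <- intersect(wanted, colnames(freqs))

# drop = FALSE keeps the result a data.frame even if only one column matches
result <- freqs[, present, drop = FALSE]
print(result)
```

Whether missing words should be silently dropped or reported as zero depends on the use case; this sketch only drops them.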
Answer 1: (score: 0)
You can do this in two ways:
Using table():
words <- "hi how are you doing today I really hate school and I want to quit how can you still go to school"
lst <- c("hi", "how", "are", "you")
table(strsplit(words, split=" "))[lst]
 hi how are you
  1   2   1   2
Using data.frame():
df <- as.data.frame(table(strsplit(words,split=" ")))
colnames(df) <- c("words","freqs")
df[df$words%in%lst,]
   words freqs
2    are     1
7     hi     1
8    how     2
17   you     2
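Note that filtering with %in% returns the rows in the table's own (alphabetical) order, not in the order of lst. If the order of the list matters, one possible variation is to index with match() instead. This is a sketch under that assumption, not part of the original answer; the name ordered is only illustrative:

```r
words <- "hi how are you doing today I really hate school and I want to quit how can you still go to school"
lst <- c("hi", "how", "are", "you")

# Split into a character vector of words and count each one
df <- as.data.frame(table(strsplit(words, split = " ")[[1]]),
                    stringsAsFactors = FALSE)
colnames(df) <- c("words", "freqs")

# match() returns, for each element of lst, its row position in df$words,
# so the result follows the order of lst; a word absent from df gives an NA row
ordered <- df[match(lst, df$words), ]
print(ordered)
```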