Find word frequencies in a table based on a list

Time: 2016-01-16 16:31:36

Tags: r list

I have a document-term matrix (dtm), and I turn it into a frequency table:

freqs <- as.data.frame(inspect(dtm1))

This is what freqs looks like. It contains a single row giving the frequency of each word in the document:

I    really   hate   school   how   can  are  you  hi
4      5        3       2      3     1    4    5   1

I also have a list of words:

list <- c("hi", "how", "are", "you")

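How can I look up the frequencies of the words in this list in the frequency table, and then collect those frequencies into a table like this?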

hi  how  are  you
1    3    4   5

2 Answers:

Answer 0: (score: 1)

If the words are the variable (column) names of the data.frame, you can select those columns directly:

> freqs[,list]
  hi how are you
1  1   3   4   5
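
A self-contained sketch of the same idea follows. It rebuilds the one-row table by hand from the values shown in the question (an assumption, since the original comes from `dtm1`), and uses the hypothetical name `wanted` instead of `list` so that the base function `list()` is not masked. The extra word "goodbye" is only there to illustrate that `intersect()` can guard against words that are not columns of the table.

```r
# Rebuild the one-row frequency table from the question by hand
# (in the question it is derived from a document-term matrix instead).
freqs <- data.frame(I = 4, really = 5, hate = 3, school = 2, how = 3,
                    can = 1, are = 4, you = 5, hi = 1)

# `wanted` is a hypothetical name; "goodbye" is deliberately not a column.
wanted <- c("hi", "how", "are", "you", "goodbye")

# Keep only the words that exist as columns, in the requested order.
present <- intersect(wanted, names(freqs))

# drop = FALSE keeps the result a data.frame even if one column remains.
result <- freqs[, present, drop = FALSE]
print(result)
```

Without the `intersect()` step, asking for a column that does not exist would stop with an "undefined columns selected" error.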

Answer 1: (score: 0)

You can do this in two ways:

Using table()

words <- "hi how are you doing today I really hate school and I want to quit how can you still go to school"
lst <- c("hi", "how", "are", "you")
table(strsplit(words, split=" "))[lst]
hi how are you 
1   2   1   2 
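
One limitation of indexing the table by name is that a word missing from the text will not come back as a zero count. A possible variation, sketched here under the assumption that zero counts are wanted, is to convert the tokens to a factor whose levels are the words of interest before tabulating. The names `tokens` and `counts`, and the extra word "goodbye", are illustrative only.

```r
words <- "hi how are you doing today I really hate school and I want to quit how can you still go to school"

# "goodbye" does not occur in `words`; it is included to show the zero count.
lst <- c("hi", "how", "are", "you", "goodbye")

# strsplit() returns a list with one element per input string; take the first.
tokens <- strsplit(words, split = " ")[[1]]

# Tokens outside `lst` become NA and are dropped by table();
# every level in `lst` is reported, in the order given.
counts <- table(factor(tokens, levels = lst))
print(counts)
```

The result keeps the order of `lst` rather than alphabetical order, which may be convenient when the list order matters.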

Using data.frame()

df <- as.data.frame(table(strsplit(words,split=" ")))
colnames(df) <- c("words","freqs")
df[df$words%in%lst,]
   words freqs
2    are     1
7     hi     1
8    how     2
17   you     2
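
Filtering with `%in%` returns the rows in the table's own (sorted) order, not the order of `lst`. If the order of the list should be preserved, one option is `match()`, sketched below. The name `in_order` is hypothetical, and `stringsAsFactors = FALSE` is used only so that the `words` column is plain character.

```r
words <- "hi how are you doing today I really hate school and I want to quit how can you still go to school"
lst <- c("hi", "how", "are", "you")

# Tabulate the tokens and convert the table to a two-column data.frame.
df <- as.data.frame(table(strsplit(words, split = " ")[[1]]),
                    stringsAsFactors = FALSE)
colnames(df) <- c("words", "freqs")

# match() gives, for each word in `lst`, its row position in `df`,
# so the rows come back in the order of `lst`.
in_order <- df[match(lst, df$words), ]
rownames(in_order) <- NULL
print(in_order)
```

If a word in `lst` were absent from the text, `match()` would return `NA` for it and the corresponding row would be filled with `NA`.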