I have the following:
text <- c('I am a human','It is an animal and not a human, I am a human','Cant think of something else to write','and and is am')
words <- c('and','am','is')
For each element of text, I want the total number of times these words occur. So the output should be:
[1] 1 3 0 4
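The code I am using works, but it is clearly not the most elegant:
library(stringr)
TotalCount <- vector(mode = 'integer', length = length(text))
for (ii in seq_along(text)) {
  for (jj in seq_along(words)) {
    wordCount <- str_count(text[ii], words[jj])
    TotalCount[ii] <- wordCount + TotalCount[ii]
  }
}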
Is there a more efficient, cleaner way to do this?
Answer 0 (score: 1)
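You can use the str_count function from the stringr library. Joining the words with "|" builds a single regular expression that matches any of them, and str_count is vectorized over text, so no loop is needed: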
library(stringr)
text <- c('I am a human','It is an animal and not a human, I am a human','Cant think of something else to write','and and is am')
words <- c('and','am','is')
str_count(text, paste(words, collapse="|"))
# [1] 1 3 0 4
Or, to count only whole-word matches, wrap the alternation in word boundaries:
str_count(text, paste0("\\b(", paste(words, collapse = "|"), ")\\b"))
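The two patterns give the same result for the text in the question, but they can differ when a word appears inside a longer word. Here is a minimal sketch of that difference, written in base R so it runs without stringr. The helper name count_matches and the sample sentence are made up for illustration and are not part of any library.

```r
# Base R sketch; count_matches is a hypothetical helper, not a stringr function.
count_matches <- function(x, pattern) {
  # gregexpr() finds all non-overlapping matches in each string; regmatches()
  # extracts them and yields character(0) where nothing matched, so lengths()
  # returns 0 for those elements.
  lengths(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
}

words  <- c('and', 'am', 'is')
loose  <- paste(words, collapse = "|")    # "and|am|is": matches anywhere
strict <- paste0("\\b(", loose, ")\\b")   # same words, whole-word matches only

sample_text <- "This island is sandy"     # made-up input for illustration

count_matches(sample_text, loose)
# [1] 5   ("is" in This, "is" and "and" in island, "is", "and" in sandy)
count_matches(sample_text, strict)
# [1] 1   (only the standalone "is")
```

If substrings such as the "is" in "This" should not count, the word-boundary version is probably the one you want.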