I have two data frames. One contains search phrases like this:
search.phrases
1 the quick
2 brown fox jumps
3 over the lazy
5 dog
6 why
7 nobody knows
...
and the other contains keywords:
keywords
1 quick
2 lazy
3 dog
4 knows
...
Ideally, I would like to find which search phrases contain one or more of these keywords (as a boolean or a count), like this:
search.phrases keyword.found
1 the quick TRUE
2 brown fox jumps FALSE
3 over the lazy TRUE
5 dog TRUE
6 why FALSE
7 nobody knows TRUE
...
I've been trying for a while, but I'm stuck. Any help is greatly appreciated.
Much love,
G
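Answer 0 (score: 3)
You can use grepl(): collapse the keywords into a single regular expression with alternation, then test each phrase against it.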
rgx <- paste(as.character(df2$keywords), collapse = "|")
df$keyword.found <- grepl(rgx, df$search.phrases)
Result:
search.phrases keyword.found
1 the quick TRUE
2 brown fox jumps FALSE
3 over the lazy TRUE
5 dog TRUE
6 why FALSE
7 nobody knows TRUE
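One caveat worth noting: the alternation pattern built with paste(..., collapse = "|") matches substrings, so a keyword such as dog would also match a phrase containing dogma. A minimal sketch, assuming whole-word matches are wanted and that the keywords contain no regex metacharacters, wraps the alternation in \\b word boundaries. The vectors keywords_demo and phrases_demo are made up for illustration and are not from the question.

```r
# Hypothetical example data (not from the question)
keywords_demo <- c("quick", "lazy", "dog", "knows")
phrases_demo  <- c("the quick", "dogma", "over the lazy", "why")

# Plain alternation: matches keywords anywhere, even inside longer words
rgx_plain   <- paste(keywords_demo, collapse = "|")
found_plain <- grepl(rgx_plain, phrases_demo)

# Alternation wrapped in word boundaries: whole-word matches only
rgx_word   <- paste0("\\b(", paste(keywords_demo, collapse = "|"), ")\\b")
found_word <- grepl(rgx_word, phrases_demo)

found_plain  # TRUE TRUE TRUE FALSE
found_word   # TRUE FALSE TRUE FALSE
```

If the keywords may contain characters such as . or (, they would need escaping before being pasted into a pattern; that step is left out of this sketch.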
Data:
df2 <- structure(list(keywords = structure(c(4L, 3L, 1L, 2L), .Label = c("dog",
"knows", "lazy", "quick"), class = "factor")), .Names = "keywords", class = "data.frame", row.names = c("1",
"2", "3", "4"))
df <- structure(list(search.phrases = structure(c(5L, 1L, 4L, 2L, 6L,
3L), .Label = c("brown fox jumps", "dog", "nobody knows", "over the lazy",
"the quick", "why"), class = "factor")), .Names = "search.phrases", class = "data.frame", row.names = c("1",
"2", "3", "5", "6", "7"))
Answer 1 (score: 1)
library(stringr)

phrases  <- c("the quick fox", "had a dog", "named bruce")
keywords <- c("quick", "bruce")

# Split each phrase into words, then test whether any word is a keyword
phrase_list <- str_split(phrases, " ")
z <- sapply(phrase_list, function(x) any(x %in% keywords))
z  # TRUE FALSE TRUE
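Since the question also asks about counts, the same split-and-match idea extends naturally. The following is a sketch, assuming words are separated by single spaces, that uses sum() rather than any() to count keyword hits per phrase. It uses base R strsplit() in place of stringr, rebuilds the question's data as plain character columns, and the column name keyword.count is a hypothetical choice.

```r
# Rebuild the question's data frames with character (not factor) columns
df  <- data.frame(search.phrases = c("the quick", "brown fox jumps",
                                     "over the lazy", "dog", "why",
                                     "nobody knows"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(keywords = c("quick", "lazy", "dog", "knows"),
                  stringsAsFactors = FALSE)

# Split each phrase on spaces; fixed = TRUE treats " " as a literal string
words <- strsplit(df$search.phrases, " ", fixed = TRUE)

# Count how many words in each phrase are keywords
# (keyword.count is a made-up column name)
df$keyword.count <- vapply(words, function(w) sum(w %in% df2$keywords),
                           integer(1))

# The boolean column follows directly from the count
df$keyword.found <- df$keyword.count > 0
```

Because this compares whole words, it avoids partial matches, but it also misses keywords attached to punctuation (for example "dog," would not match "dog") unless the phrases are cleaned first.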