How to check whether the word(s) in one column appear in another column that contains sentences

Asked: 2017-08-02 13:05:21

Tags: r text

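I am looking for an R solution that checks whether a word or several words (column 1) occur in the sentence in column 2 of a data frame. If any of the words appears in the sentence, it should return 1 (TRUE); otherwise it should return 0 (FALSE). Screenshots in the original post: "This is how my DF looks now" and "This is how it should look".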

2 answers:

Answer 0 (score: 1)

This should work for you:

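# Turn each space-separated phrase into a regex alternation, e.g. "my|new|phone"
df[, "lookup"] <- gsub(" ", "|", df[, "substring"])
# Test each row's pattern against the sentence in the same row
df[, "t"] <- mapply(grepl, df[, "lookup"], df[, "string"])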

df
#                 substring                 string                   lookup     t
#1             my new phone this is a mobile phone             my|new|phone  TRUE
#2 She would buy new phones Yes, I have two phones She|would|buy|new|phones  TRUE
#3            telephonessss       my old telephone            telephonessss FALSE
#4             telephone234           telephone234             telephone234  TRUE

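You could get fancier when building the lookup column, but that is not necessary for this case, so I used a simple gsub. Note that the pattern matches substrings, so "phone" would also be found inside "telephone".

Data: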
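One way the lookup column could be made "fancier" is to match whole words only. The following is a minimal sketch, not part of the original answer. It assumes the words contain only letters and digits (regex metacharacters are not escaped), and the two-row data frame is made up for illustration.

```r
# Illustrative data (hypothetical, not the data from the answer)
df <- data.frame(
  substring = c("my new phone", "phone"),
  string    = c("this is a mobile phone", "my old telephone"),
  stringsAsFactors = FALSE
)

# Collapse runs of whitespace into "|" and wrap the alternation in \b ... \b
# so that only whole words match.
df$lookup <- paste0("\\b(", gsub("\\s+", "|", trimws(df$substring)), ")\\b")

# Apply each row's pattern to the sentence in the same row
df$t <- mapply(grepl, df$lookup, df$string, USE.NAMES = FALSE)

df$t
```

With the plain `gsub(" ", "|", ...)` lookup the second row would match, because "phone" is a substring of "telephone". With the word boundaries it does not.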

df <- data.frame(substring = c("my new phone", "She would buy new phones", "telephonessss", "telephone234"),
                 string = c("this is a mobile phone", "Yes, I have two phones", "my old telephone", "telephone234"))

Answer 1 (score: 1)

Or use a dplyr & stringr solution. In principle it is the same idea:

library(dplyr)
library(stringr)
# Inside mutate() the columns can be referenced directly, without df$
df %>%
  mutate(result = str_detect(string, gsub(" ", "|", substring)))
#                 substring                 string result
#1             my new phone this is a mobile phone   TRUE
#2 She would buy new phones Yes, I have two phones   TRUE
#3            telephonessss       my old telephone  FALSE
#4             telephone234           telephone234   TRUE
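
The question asked for 1/0 rather than TRUE/FALSE. Both answers return a logical column, which can be converted with `as.integer()`. A minimal base R sketch follows, using a shortened, made-up version of the data:

```r
# Illustrative data (hypothetical two-row subset)
df <- data.frame(
  substring = c("my new phone", "telephonessss"),
  string    = c("this is a mobile phone", "my old telephone"),
  stringsAsFactors = FALSE
)

# Same idea as in the answers: one alternation pattern per row
matched <- mapply(grepl, gsub(" ", "|", df$substring), df$string,
                  USE.NAMES = FALSE)

# TRUE/FALSE becomes 1/0
df$result <- as.integer(matched)

df$result
```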