Question

I have a data frame in this format:

A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")

df <- data.frame(A, B, stringsAsFactors = FALSE)

I would like to get this result:

ID   A                           B
1    John Smith                  is a very highly smart guy
2    Red Shirt                   We tried the tea but didn't enjoy it at all
3    Family values are better    is very important as it gives you

I tried:

test<-df %>% filter(sapply(1:nrow(.), function(i) grepl(A[i], B[i])))

But it doesn't give me what I want.

Any suggestions or help?

Answer 1

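One solution is to use mapply and strsplit.

The trick is to split each element of df$A into separate words, collapse those words into a single string separated by |, and then use that string as the pattern in gsub, replacing every match with "".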

lst <- strsplit(df$A, split = " ")

df$B <- mapply(function(x,y){gsub(paste0(x,collapse = "|"), "",df$B[y])},lst,1:length(lst))
df
# A                                           B
# 1               John Smith                  is a very highly smart guy
# 2                Red Shirt We tried the tea but didn't enjoy it at all
# 3 Family values are better          is very important as it gives you

Another option is:

df$B <- mapply(function(x,y)gsub(x,"",y) ,gsub(" ", "|",df$A),df$B)
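One caveat with both versions above: the pattern matches substrings, so a short word in A (for example "a") would also be removed from inside longer words in B, and each removed word leaves a stray space behind. Below is a minimal sketch of a stricter variant that matches whole words only and tidies the whitespace. It assumes the words in A contain no regex metacharacters, and the column name B_clean is only illustrative, not part of the original answer.

```r
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# One pattern per row, e.g. "\\b(John|Smith)\\b", so only whole words match
patterns <- vapply(
  strsplit(df$A, " ", fixed = TRUE),
  function(words) paste0("\\b(", paste(words, collapse = "|"), ")\\b"),
  character(1)
)

# Remove the matched words row by row
cleaned <- mapply(function(p, s) gsub(p, "", s, perl = TRUE),
                  patterns, df$B, USE.NAMES = FALSE)

# Collapse repeated spaces and trim the ends
df$B_clean <- trimws(gsub(" +", " ", cleaned))
df$B_clean
```

If the words in A may contain characters such as . or (, they would need escaping before being pasted into the pattern.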

Data:

A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

Answer 2

Another option, using the stringr::str_split_fixed function:

library(stringr)

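# Join A and B with a marker, drop repeated words, then split at the marker again
joined <- paste(df$A, df$B, sep = " columnbreaker ")
deduped <- sapply(joined, function(i) {
  paste(unique(strsplit(as.character(i), split = " ")[[1]]), collapse = " ")
})
str_split_fixed(deduped, " columnbreaker ", 2)
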
#       [,1]                       [,2]                                         
# [1,] "John Smith"               "is a very highly smart guy"                 
# [2,] "Red Shirt"                "We tried the tea but didn't enjoy it at all"
# [3,] "Family values are better" "is very important as it gives you"
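Note that unique() is applied to the whole combined string, so a word that is repeated within B itself would also be dropped. A possible base R sketch of a word-level alternative, which only removes words that appear in the same row of A, is shown below. The helper name drop_words and the column B_words are hypothetical and not part of the original answer.

```r
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# Keep only the words of b that do not occur among the words of a
drop_words <- function(a, b) {
  a_words <- strsplit(a, " ", fixed = TRUE)[[1]]
  b_words <- strsplit(b, " ", fixed = TRUE)[[1]]
  paste(b_words[!(b_words %in% a_words)], collapse = " ")
}

# Apply the helper to each (A, B) pair
df$B_words <- mapply(drop_words, df$A, df$B, USE.NAMES = FALSE)
df$B_words
```

Because the comparison is done on whole words with %in%, no regular expressions are involved and partial matches inside longer words cannot occur.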
