Input:

DF1:

ColA
text1
text2
text3
text4
text5
text6
text7

DF2:

ColA
text1 text2 text12
text23 text22 text7

Intermediate output:

ColA                   ColB
text1 text2 text12     text1, text2
text23 text22 text7    text7

Final output:

ColA                   ColB
text1 text2 text12     text1
text1 text2 text12     text2
text23 text22 text7    text7

Approach:

I am currently using

test$test <- sapply(df2$ColA, function(x) ifelse(grep(paste(as.character(unlist(df1$ColA)),collapse="|"),x),1,0))

This tells me whether any df1$ColA string matches a df2$ColA string, but it does not return the matched strings. Please advise.

Answer 1

This might help you:

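# build five random space-separated strings of five letters each
df2 <- matrix(sample(LETTERS)[-1], nrow=5)
df2 <- apply(df2, 1, FUN=function(x) paste(x, collapse=' '))

data <- data.frame(a=LETTERS[1:5], b=df2) ; data

# split each string in column b into its tokens
df2 <- sapply(1:nrow(data), function(x) strsplit(as.character(data$b[x]), ' '))

# positions at which a[x] occurs among the tokens of row x
sapply(1:nrow(data), function(x) which(data$a[x] == df2[[x]]))

# logical version of the same comparison
sapply(1:nrow(data), function(x) data$a[x] == df2[[x]])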
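The snippet above works on random letters, so it may help to see the same split-and-compare idea applied to the data from the question. The following is only a sketch: the names tokens and matched are made up for illustration, and it assumes tokens are separated by single spaces and should match df1$ColA exactly (so "text12" does not count as a match for "text1").

```r
# Data from the question
df1 <- data.frame(ColA = paste0("text", 1:7), stringsAsFactors = FALSE)
df2 <- data.frame(ColA = c("text1 text2 text12", "text23 text22 text7"),
                  stringsAsFactors = FALSE)

# Split each df2 string into space-separated tokens (assumed delimiter)
tokens <- strsplit(df2$ColA, " ", fixed = TRUE)

# Keep only the tokens that occur in df1$ColA (whole-token comparison)
matched <- lapply(tokens, function(tok) tok[tok %in% df1$ColA])

# Collapse the matches of each row into one comma-separated string
df2$ColB <- vapply(matched, paste, character(1), collapse = ", ")
df2
```

With the question's data this should reproduce the intermediate output, with the matched strings of each row collapsed into ColB.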

Answer 2

Here is a semi-vectorized solution based on match() that should be fast and produces what you are looking for. It tokenizes df2$ColA and matches each token against df1$ColA. It then repeats each whole (original) df2$ColA element once per token and adds the matches as ColB in the output.

# set up the data, which the OP should have done
df1 <- data.frame(ColA = paste0("text", 1:7), stringsAsFactors = FALSE)
df2 <- data.frame(ColA = c("text1 text2 text12", "text23 text22 text7"), stringsAsFactors = FALSE)

# create a matrix of matches: tokens of df2$ColA looked up in df1$ColA
matmatrix <- sapply(strsplit(df2$ColA, " "), match, df1$ColA)

# repeat the original text once per token (one potential match each)
origdfColArep <- rep(df2$ColA, each = nrow(matmatrix))

# create the results dataset, first the df2 text for every token that matched
result <- data.frame(ColA = origdfColArep[!is.na(as.vector(matmatrix))], stringsAsFactors = FALSE)

# then add the matching df1 element
result$ColB <- df1$ColA[na.omit(as.vector(matmatrix))]

result
##                  ColA  ColB
## 1  text1 text2 text12 text1
## 2  text1 text2 text12 text2
## 3 text23 text22 text7 text7
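One caveat with the match() approach: sapply() only returns a matrix when every df2$ColA string splits into the same number of tokens, as it does in the question's data. If rows can differ in length, a list-based variant along the following lines might be an option. This is a sketch, not part of the original answer: the third df2 row is a hypothetical addition used to show unequal token counts, and the names tokens and matched are illustrative.

```r
df1 <- data.frame(ColA = paste0("text", 1:7), stringsAsFactors = FALSE)

# The first two rows are from the question; the third is a hypothetical
# extra row with a different number of tokens
df2 <- data.frame(ColA = c("text1 text2 text12",
                           "text23 text22 text7",
                           "text5 text40"),
                  stringsAsFactors = FALSE)

# Tokenize each row; the result is a list, so lengths may differ
tokens <- strsplit(df2$ColA, " ", fixed = TRUE)

# Tokens of each row that are present in df1$ColA
matched <- lapply(tokens, function(tok) tok[tok %in% df1$ColA])

# Repeat each original string once per matched token, then flatten
result <- data.frame(ColA = rep(df2$ColA, times = lengths(matched)),
                     ColB = unlist(matched),
                     stringsAsFactors = FALSE)
result
```

The first three rows of result correspond to the final output requested in the question; the fourth comes from the hypothetical extra row.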


String matching of lists in R
