This is a table with many rows, but to keep the question simple, here is a small example:
dt1 <- data.frame(col1 = c("C,Y,M", "B,C,M,A"), col2 = c("B,E,M", "B,A,G"), col3 = c("2", "10"))
     col1  col2 col3
1   C,Y,M B,E,M    2
2 B,C,M,A B,F,G   10
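So what I want to do is:
1. Pair every string in col1 with every string in col2 of the same row, but ignore any string that is common to both columns. For example, C pairs with B and C pairs with E, but not with M, because M appears in both columns of that row; similarly, Y pairs with B and Y pairs with E, but again not with M.
2. Each pair keeps the corresponding value from col3.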
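To make the rule concrete, here is a minimal sketch for the first row only. The names `a`, `b`, `common`, and `pairs` are purely illustrative and are not part of the real data; the row is split by hand rather than taken from `dt1`.

```r
# First row of dt1, split by hand (illustrative names, not real data)
a <- c("C", "Y", "M")   # strings from col1
b <- c("B", "E", "M")   # strings from col2

# Strings that appear in both columns of this row
common <- intersect(a, b)

# Pair the remaining strings and attach the row's col3 value
pairs <- expand.grid(col1 = setdiff(a, common),
                     col2 = setdiff(b, common),
                     col3 = "2",
                     stringsAsFactors = FALSE)
pairs
```

Here M is the only shared string, so it is left out of both sides before pairing.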
Expected output table:
dt2 <- data.frame(col1 =c("C","C","Y","Y","C","C","M","M","A","A"),col2 = c("B","E","B","E","F","G","F","G","F","G"),col3=c("2","2","2","2","10","10","10","10","10","10"))
   col1 col2 col3
1     C    B    2
2     C    E    2
3     Y    B    2
4     Y    E    2
5     C    F   10
6     C    G   10
7     M    F   10
8     M    G   10
9     A    F   10
10    A    G   10
Answer 0 (score: 2)
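Maybe you can try something like this. Note that there is an error in your sample data: the code that builds dt1 has "B,A,G" in the second row of col2, while the printed table shows "B,F,G". I have used "B,F,G" here, which matches your expected output.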
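dt1 <- data.frame(col1 = c("C,Y,M", "B,C,M,A"),
                  col2 = c("B,E,M", "B,F,G"),
                  col3 = c("2", "10"))
# Split every column on commas; each column becomes a list of character vectors
x <- lapply(dt1, function(x) strsplit(as.character(x), ",", fixed = TRUE))
# For one row: drop the strings shared by both columns, then pair the rest
myFun <- function(x, y, z) {
  drop <- intersect(x, y)
  expand.grid(x[!x %in% drop], y[!y %in% drop], z)
}
# Apply the function row by row and stack the results
do.call(rbind, Map(myFun, x[[1]], x[[2]], x[[3]]))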
#    Var1 Var2 Var3
# 1     C    B    2
# 2     Y    B    2
# 3     C    E    2
# 4     Y    E    2
# 5     C    F   10
# 6     M    F   10
# 7     A    F   10
# 8     C    G   10
# 9     M    G   10
# 10    A    G   10
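The result above has the right pairs, but the columns are named Var1 to Var3 and the rows are ordered differently from the table in the question. If that matters, a variant along the following lines might help. This is only a sketch: `pair_row` is a hypothetical helper name, and it assumes the columns are plain character vectors (hence `stringsAsFactors = FALSE`).

```r
dt1 <- data.frame(col1 = c("C,Y,M", "B,C,M,A"),
                  col2 = c("B,E,M", "B,F,G"),
                  col3 = c("2", "10"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: pair the non-shared strings of a single row
pair_row <- function(a, b, value) {
  a <- strsplit(a, ",", fixed = TRUE)[[1]]
  b <- strsplit(b, ",", fixed = TRUE)[[1]]
  common <- intersect(a, b)
  # expand.grid varies its first argument fastest, so col2 is listed
  # first to get the row order used in the question's dt2
  grid <- expand.grid(col2 = setdiff(b, common),
                      col1 = setdiff(a, common),
                      col3 = value,
                      stringsAsFactors = FALSE)
  grid[c("col1", "col2", "col3")]
}

dt2 <- do.call(rbind, Map(pair_row, dt1$col1, dt1$col2, dt1$col3))
rownames(dt2) <- NULL   # drop the row names that Map/rbind derive from col1
dt2
```

With the corrected sample data this should give the ten rows in the same order as the dt2 shown in the question, with columns col1, col2, and col3 kept as character.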