I have two data frames:
df1
Column1 Column2
A id1
B id2
C id3
B id2
D id4
A id1
C id3
df2
Column1 Column2 Column3
X m1 m2
A m3 m4
A m3 m4
Y n1 n2
A m3 m4
Z p1 p2
X m1 m2
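I want to merge df1 and df2 conditionally: if the value in Column1 of a df1 row is A, then Column2 and Column3 from df2 should be selectively merged into that row of df1.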
So the final df1 would look like this:
df1
Column1 Column2.1 Column1.2 Column2.2 Column3.2
A id1 id1 m3 m4
B id2
C id3
B id2
D id4
A id1 id1 m3 m4
C id3
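So far I have managed this by extracting the rows of df1 whose Column1 contains "A" and then applying a merge inside a for loop to get the two columns from df2. Is it possible to use an if statement to perform the conditional merge between df1 and df2 directly?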
Here is the structure of df1 and df2:
df1 <- structure(list(Column1 = structure(c(1L, 2L, 3L, 2L, 4L, 1L,
3L), .Label = c("A", "B", "C", "D"), class = "factor"), Column2 = structure(c(1L,
2L, 3L, 2L, 4L, 1L, 3L), .Label = c("id1", "id2", "id3", "id4"
), class = "factor")), .Names = c("Column1", "Column2"), class = "data.frame", row.names = c(NA,
-7L))
df2 <- structure(list(Column1 = structure(c(2L, 1L, 1L, 3L, 1L, 4L,
2L), .Label = c("A", "X", "Y", "Z"), class = "factor"), Column2 = structure(c(1L,
2L, 2L, 3L, 2L, 4L, 1L), .Label = c("m1", "m3", "n1", "p1"), class = "factor"),
Column3 = structure(c(1L, 2L, 2L, 3L, 2L, 4L, 1L), .Label = c("m2",
"m4", "n2", "p2"), class = "factor")), .Names = c("Column1",
"Column2", "Column3"), class = "data.frame", row.names = c(NA,
-7L))
Answer 0 (score: 0)
If df1 and df2 are defined as above:
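library(sqldf)
# Correlated subqueries look up df2's Column2 and Column3 for each df1 row; rows without a match in df2 get NA
final <- sqldf("select df1.Column1 as Column1, df1.Column2, (select distinct df2.Column2 from df2 where df2.Column1 = df1.Column1) as Column2_2, (select distinct df2.Column3 from df2 where df2.Column1 = df1.Column1) as Column3_2 from df1")
# Repeat df1's Column2 only on the "A" rows; as.character() keeps the labels if Column2 comes back as a factor
Column1.2 <- ifelse(final$Column1 == "A", as.character(final$Column2), NA)
final <- cbind(final, Column1.2)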
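If you would rather avoid the sqldf dependency, the same conditional lookup can probably be done in base R with match(). This is only a sketch: it rebuilds the example data as character columns, and it assumes every key in df2 maps to a single (Column2, Column3) pair, as it does in the example. The names lookup, idx, is_a and result are just illustrative.

```r
# Rebuild the example data (plain character columns instead of factors)
df1 <- data.frame(
  Column1 = c("A", "B", "C", "B", "D", "A", "C"),
  Column2 = c("id1", "id2", "id3", "id2", "id4", "id1", "id3"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Column1 = c("X", "A", "A", "Y", "A", "Z", "X"),
  Column2 = c("m1", "m3", "m3", "n1", "m3", "p1", "m1"),
  Column3 = c("m2", "m4", "m4", "n2", "m4", "p2", "m2"),
  stringsAsFactors = FALSE
)

# Drop duplicate rows so each key appears once in the lookup table
# (assumption: one Column2/Column3 pair per key)
lookup <- unique(df2)

# For each df1 row, the index of the matching lookup row (NA if there is none)
idx <- match(df1$Column1, lookup$Column1)

# Only rows where Column1 is "A" should receive values; blank out the rest
is_a <- df1$Column1 == "A"
idx[!is_a] <- NA

# Indexing with NA yields NA, so the non-"A" rows stay empty
result <- data.frame(
  Column1   = df1$Column1,
  Column2.1 = df1$Column2,
  Column1.2 = ifelse(is_a, df1$Column2, NA),
  Column2.2 = lookup$Column2[idx],
  Column3.2 = lookup$Column3[idx],
  stringsAsFactors = FALSE
)
print(result)
```

Because the condition is applied to the whole vector at once, no for loop or if statement is needed. The non-"A" rows hold NA rather than blank cells, which you can replace afterwards if you need empty strings.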