I want to merge my data conditionally. I have looked through many sites but could not find what I need. I have two datasets:
# dt1
ColA1 ColA2 ColB ColC ColD Area
TA43 TI44 S2230 Amy 2014-08-08 USA
TA63 TI64 T1205 Andy 2014-01-01 CANADA
TA28 TI100 L1288 Peter 2014-01-08 EU
TA28 TI100 L2231 Roger 2014-01-08 EU
TA92 NA A2206 Jean 2014-01-12 China
TA14 NA E2240 Freda 2014-01-05 Japan
TA69 TI50 N1029 Tina 2014-01-05 Mexico
# dt2
ColA ColB ColC ColD TYPE
TI64 T1205 Andy 2014-01-01 I
TI100 L1288 Peter 2014-01-08 I
TI100 L2231 Roger 2014-01-08 I
TA92 A2206 Jean 2014-01-12 A
TA14 E2240 Freda 2014-01-05 R
TA69 N1029 Tina 2014-01-05 A
What I want is:
ColA ColB ColC ColD TYPE Area
TI64 T1205 Andy 2014-01-01 I CANADA
TI100 L1288 Peter 2014-01-08 I EU
TI100 L2231 Roger 2014-01-08 I EU
TA92 A2206 Jean 2014-01-12 A China
TA14 E2240 Freda 2014-01-05 R Japan
TA69 N1029 Tina 2014-01-05 A Mexico
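Let me explain:
I want to map dt1 onto dt2 by ColA, ColB, ColC and ColD.
If the TYPE column in dt2 is A or R, then ColA in dt2 should be matched against ColA1 in dt1.
If the TYPE column in dt2 is I, then ColA in dt2 should be matched against ColA2 in dt1.
Any ideas on how to do this?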
Answer 0 (score: 1)
This produces the desired output with dplyr. You can also use inner_join or right_join, depending on what you want to achieve:
library(dplyr)
library(tidyr)
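dt2 %>%
  # choose which dt1 column each row matches on: ColA2 for TYPE "I", otherwise ColA1
  mutate(merge_col = ifelse(TYPE == "I", "ColA2", "ColA1")) %>%
  # reshape dt1 so ColA1/ColA2 become rows labelled by merge_col, then join on the shared columns
  left_join(dt1 %>% gather(merge_col, ColA, ColA1, ColA2))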
# Joining, by = c("ColA", "ColB", "ColC", "merge_col")
# ColA ColB ColC ColD TYPE merge_col Area
# 1 TI64 T1205 Andy 2014-01-01 I ColA2 CANADA
# 2 TI100 L1288 Peter 2014-01-08 I ColA2 EU
# 3 TI100 L2231 Roger 2014-01-08 I ColA2 EU
# 4 TA92 A2206 Jean 2014-01-12 A ColA1 China
# 5 TA14 E2240 Freda 2014-01-05 R ColA1 Japan
# 6 TA69 N1029 Tina 2014-01-05 A ColA1 Mexico
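For anyone who wants to reproduce this, here is a self-contained sketch of the same idea. It rebuilds the sample data from the question as plain data frames (the column types are my assumption; everything is kept as character), swaps gather for pivot_longer, which needs tidyr 1.0.0 or later, and spells out the join keys explicitly. The names dt1_long and result are just illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question; all columns kept as character (an assumption)
dt1 <- data.frame(
  ColA1 = c("TA43", "TA63", "TA28", "TA28", "TA92", "TA14", "TA69"),
  ColA2 = c("TI44", "TI64", "TI100", "TI100", NA, NA, "TI50"),
  ColB  = c("S2230", "T1205", "L1288", "L2231", "A2206", "E2240", "N1029"),
  ColC  = c("Amy", "Andy", "Peter", "Roger", "Jean", "Freda", "Tina"),
  ColD  = c("2014-08-08", "2014-01-01", "2014-01-08", "2014-01-08",
            "2014-01-12", "2014-01-05", "2014-01-05"),
  Area  = c("USA", "CANADA", "EU", "EU", "China", "Japan", "Mexico"),
  stringsAsFactors = FALSE
)
dt2 <- data.frame(
  ColA = c("TI64", "TI100", "TI100", "TA92", "TA14", "TA69"),
  ColB = c("T1205", "L1288", "L2231", "A2206", "E2240", "N1029"),
  ColC = c("Andy", "Peter", "Roger", "Jean", "Freda", "Tina"),
  ColD = c("2014-01-01", "2014-01-08", "2014-01-08",
           "2014-01-12", "2014-01-05", "2014-01-05"),
  TYPE = c("I", "I", "I", "A", "R", "A"),
  stringsAsFactors = FALSE
)

# Long form of dt1: one row per (original row, ColA1/ColA2), with
# merge_col holding the source column name and ColA holding its value
dt1_long <- dt1 %>%
  pivot_longer(c(ColA1, ColA2), names_to = "merge_col", values_to = "ColA")

result <- dt2 %>%
  # TYPE "I" matches on ColA2, every other TYPE matches on ColA1
  mutate(merge_col = ifelse(TYPE == "I", "ColA2", "ColA1")) %>%
  left_join(dt1_long, by = c("ColA", "ColB", "ColC", "ColD", "merge_col")) %>%
  # drop the helper column to get the layout asked for in the question
  select(ColA, ColB, ColC, ColD, TYPE, Area)

print(result)
```

Listing the keys in by avoids depending on whichever columns the two tables happen to share, and the final select drops the helper column merge_col.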
data.table
With data.table you can try the following, which is a direct translation of the dplyr code:
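library(data.table)
# Assumes dt1 and dt2 are already data.tables (convert data frames with setDT())
merge(
  # add merge_col to dt2 by reference: ColA2 for TYPE "I", otherwise ColA1
  dt2[, merge_col := ifelse(TYPE == "I", "ColA2", "ColA1")],
  # reshape dt1 to long form so ColA1/ColA2 become rows labelled by merge_col
  melt(dt1, id.vars = c("ColB", "ColC", "ColD", "Area"), measure.vars = c("ColA1", "ColA2"),
       variable.name = "merge_col", value.name = "ColA", variable.factor = FALSE),
  all.x = TRUE
)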
Adjust the all.x and all.y arguments according to the type of join you want.
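To illustrate how those arguments change the result, here is a small sketch on two hypothetical toy tables (x, y and their columns are made up for illustration and are not part of the question's data):

```r
library(data.table)

# Hypothetical toy tables, only to show the effect of all.x / all.y / all
x <- data.table(id = c(1, 2, 3), left_val  = c("a", "b", "c"))
y <- data.table(id = c(2, 3, 4), right_val = c("B", "C", "D"))

inner <- merge(x, y, by = "id")                # default: only ids present in both
left  <- merge(x, y, by = "id", all.x = TRUE)  # keep every row of x
right <- merge(x, y, by = "id", all.y = TRUE)  # keep every row of y
full  <- merge(x, y, by = "id", all = TRUE)    # keep rows from both sides

print(left)
```

Rows without a match on the other side are kept with NA in the columns that come from the other table, so all.x = TRUE corresponds to left_join, all.y = TRUE to right_join, and the default to inner_join.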