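Merge multiple columns by condition

Date: 2017-09-11 09:47:15

Tags: r merge data.table mapping data-manipulation

I want to merge my data by condition. I have looked through many websites but could not find what I need. I have two datasets: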

# dt1
ColA1   ColA2     ColB    ColC    ColD           Area
TA43    TI44      S2230   Amy     2014-08-08     USA
TA63    TI64      T1205   Andy    2014-01-01     CANADA
TA28    TI100     L1288   Peter   2014-01-08     EU
TA28    TI100     L2231   Roger   2014-01-08     EU
TA92    NA        A2206   Jean    2014-01-12     China
TA14    NA        E2240   Freda   2014-01-05     Japan
TA69    TI50      N1029   Tina    2014-01-05     Mexico

# dt2
ColA     ColB    ColC    ColD           TYPE
TI64     T1205   Andy    2014-01-01     I
TI100    L1288   Peter   2014-01-08     I
TI100    L2231   Roger   2014-01-08     I
TA92     A2206   Jean    2014-01-12     A 
TA14     E2240   Freda   2014-01-05     R
TA69     N1029   Tina    2014-01-05     A

What I want is:

ColA     ColB    ColC    ColD           TYPE   Area
TI64     T1205   Andy    2014-01-01     I      CANADA
TI100    L1288   Peter   2014-01-08     I      EU
TI100    L2231   Roger   2014-01-08     I      EU
TA92     A2206   Jean    2014-01-12     A      China
TA14     E2240   Freda   2014-01-05     R      Japan
TA69     N1029   Tina    2014-01-05     A      Mexico

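Let me explain:
I want to map dt1 to dt2 by ColA, ColB, ColC and ColD. If TYPE in dt2 is A or R, ColA in dt2 should be matched against ColA1 in dt1. If TYPE in dt2 is I, ColA in dt2 should be matched against ColA2 in dt1.

Any ideas for a data.table way to do this?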

1 Answer:

Answer 0: (score: 1)

This gets the desired output with dplyr. You can also use inner_join or right_join instead, depending on what you want to achieve:

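library(dplyr)
library(tidyr)

# Pick the dt1 column each dt2 row should match: ColA2 for TYPE "I", ColA1 otherwise
dt2 %>% mutate(merge_col = ifelse(TYPE == "I","ColA2","ColA1")) %>%
  # Reshape dt1 to long form (ColA1/ColA2 stacked into ColA), then join on the shared columns
  left_join(dt1 %>% gather(merge_col,ColA,ColA1,ColA2))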

# Joining, by = c("ColA", "ColB", "ColC", "merge_col")
# ColA  ColB  ColC       ColD TYPE merge_col   Area
# 1  TI64 T1205  Andy 2014-01-01    I     ColA2 CANADA
# 2 TI100 L1288 Peter 2014-01-08    I     ColA2     EU
# 3 TI100 L2231 Roger 2014-01-08    I     ColA2     EU
# 4  TA92 A2206  Jean 2014-01-12    A     ColA1  China
# 5  TA14 E2240 Freda 2014-01-05    R     ColA1  Japan
# 6  TA69 N1029  Tina 2014-01-05    A     ColA1 Mexico
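
For anyone on a recent tidyr (1.0.0 or later), where gather() is superseded, here is a rough self-contained sketch of the same idea using pivot_longer() instead. This swaps in a different reshaping function from the one used above; the names dt1_long and result are mine, and I am assuming all columns (including ColD) are plain character:

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question; every column is read as character
dt1 <- read.table(text = "
ColA1 ColA2 ColB  ColC  ColD       Area
TA43  TI44  S2230 Amy   2014-08-08 USA
TA63  TI64  T1205 Andy  2014-01-01 CANADA
TA28  TI100 L1288 Peter 2014-01-08 EU
TA28  TI100 L2231 Roger 2014-01-08 EU
TA92  NA    A2206 Jean  2014-01-12 China
TA14  NA    E2240 Freda 2014-01-05 Japan
TA69  TI50  N1029 Tina  2014-01-05 Mexico
", header = TRUE, stringsAsFactors = FALSE)

dt2 <- read.table(text = "
ColA  ColB  ColC  ColD       TYPE
TI64  T1205 Andy  2014-01-01 I
TI100 L1288 Peter 2014-01-08 I
TI100 L2231 Roger 2014-01-08 I
TA92  A2206 Jean  2014-01-12 A
TA14  E2240 Freda 2014-01-05 R
TA69  N1029 Tina  2014-01-05 A
", header = TRUE, stringsAsFactors = FALSE)

# Stack ColA1/ColA2 into a single ColA column, remembering the source column
# in merge_col; rows where ColA2 is NA are dropped
dt1_long <- dt1 %>%
  pivot_longer(c(ColA1, ColA2),
               names_to = "merge_col", values_to = "ColA",
               values_drop_na = TRUE)

result <- dt2 %>%
  # TYPE "I" rows match on ColA2, everything else (A, R) on ColA1
  mutate(merge_col = if_else(TYPE == "I", "ColA2", "ColA1")) %>%
  # Spell out the join columns so the join does not depend on name guessing
  left_join(dt1_long, by = c("ColA", "ColB", "ColC", "ColD", "merge_col")) %>%
  # The helper column is not part of the desired output
  select(-merge_col)

print(result)
```

Passing by explicitly also silences the "Joining, by = ..." message and makes it obvious which columns take part in the match.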

data.table

With data.table you can try this; it is an exact translation:

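library(data.table)

merge(
  # Tag each dt2 row with the dt1 column it should match (adds merge_col to dt2 by reference)
  dt2[, merge_col := ifelse(TYPE == "I", "ColA2", "ColA1")],
  # Long form of dt1: ColA1/ColA2 stacked into ColA, labelled by merge_col
  melt(dt1, id.vars = c("ColB", "ColC", "ColD", "Area"), measure.vars = c("ColA1", "ColA2"),
       variable.name = "merge_col", value.name = "ColA", variable.factor = FALSE),
  all.x = TRUE
)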

Adjust the all.x and all.y arguments depending on the type of join you want.
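
If you would rather add Area to dt2 in place than build a new merged table, an update join is another option. The following is only a sketch under a few assumptions: the sample data from the question, all columns stored as character, and at most one dt1 match per dt2 row. The name dt1_long is mine.

```r
library(data.table)

# Sample data copied from the question; every column is read as character
dt1 <- as.data.table(read.table(text = "
ColA1 ColA2 ColB  ColC  ColD       Area
TA43  TI44  S2230 Amy   2014-08-08 USA
TA63  TI64  T1205 Andy  2014-01-01 CANADA
TA28  TI100 L1288 Peter 2014-01-08 EU
TA28  TI100 L2231 Roger 2014-01-08 EU
TA92  NA    A2206 Jean  2014-01-12 China
TA14  NA    E2240 Freda 2014-01-05 Japan
TA69  TI50  N1029 Tina  2014-01-05 Mexico
", header = TRUE, stringsAsFactors = FALSE))

dt2 <- as.data.table(read.table(text = "
ColA  ColB  ColC  ColD       TYPE
TI64  T1205 Andy  2014-01-01 I
TI100 L1288 Peter 2014-01-08 I
TI100 L2231 Roger 2014-01-08 I
TA92  A2206 Jean  2014-01-12 A
TA14  E2240 Freda 2014-01-05 R
TA69  N1029 Tina  2014-01-05 A
", header = TRUE, stringsAsFactors = FALSE))

# Long form of dt1: ColA1/ColA2 stacked into ColA, with merge_col recording
# which column each value came from (kept as character, NA values dropped)
dt1_long <- melt(dt1,
                 measure.vars    = c("ColA1", "ColA2"),
                 variable.name   = "merge_col",
                 value.name      = "ColA",
                 variable.factor = FALSE,
                 na.rm           = TRUE)

# TYPE "I" rows match on ColA2, everything else (A, R) on ColA1
dt2[, merge_col := ifelse(TYPE == "I", "ColA2", "ColA1")]

# Update join: copy Area from the matching dt1_long row into dt2 by reference;
# dt2 rows without a match keep NA
dt2[dt1_long, on = .(ColA, ColB, ColC, ColD, merge_col), Area := i.Area]

# Drop the helper column again
dt2[, merge_col := NULL]

print(dt2)
```

Unlike merge(), this keeps dt2 in its original row order and behaves like a left join, so it is not a drop-in replacement if you need all.y-style results.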