Left join two tables on the LHS rows that satisfy a condition, leaving the others as NA

Time: 2018-09-20 23:45:17

Tags: r

Consider the following scenario:

test <- data.frame(Id1  = c(1, 2, 3, 4, 5, 10, 11), 
                   Id2  = c(3, 4, 10, 11, 12, 15, 9), 
                   Type = c(1, 1, 1, 2, 2, 2, 1) )
test
#>   Id1 Id2 Type
#> 1   1   3    1
#> 2   2   4    1
#> 3   3  10    1
#> 4   4  11    2
#> 5   5  12    2
#> 6  10  15    2
#> 7  11   9    1

I would like to self-join test by Id2 = Id1, but only for rows where Type has a specific value, for example Type == 1, so as to obtain the following result:

#>   Id1 Id2 Type.x Id2.y Type.y
#> 1   1   3      1    10      1   # matches row 3
#> 2   2   4      1    11      2   # matches row 4
#> 3   3  10      1    15      2   # matches row 6
#> 4   4  11      2    NA     NA   # matches row 7 but Type != 1
#> 5   5  12      2    NA     NA   # Type != 1
#> 6  10  15      2    NA     NA   # Type != 1
#> 7  11   9      1    NA     NA   # Type == 1 but no matches
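
The row matches noted in the comments can be checked with base R's match(). This is only an illustrative sketch, not part of the original question: the name matched_row is made up here, and the approach assumes that Id1 values are unique, as they are in this example.

```r
test <- data.frame(Id1  = c(1, 2, 3, 4, 5, 10, 11),
                   Id2  = c(3, 4, 10, 11, 12, 15, 9),
                   Type = c(1, 1, 1, 2, 2, 2, 1))

# For each row, the index of the row whose Id1 equals this row's Id2
# (NA if there is none). Assumes Id1 is unique.
matched_row <- match(test$Id2, test$Id1)

# Rows whose Type is not 1 must stay unjoined, so their match is discarded.
matched_row[test$Type != 1] <- NA

matched_row
#> [1]  3  4  6 NA NA NA NA
```

Row 4 does have a partner (row 7), but it is masked out because its Type is 2, while row 7 keeps NA simply because no Id1 equals 9.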

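Since test represents a hierarchy in this case, this type of join would let me "expand" the hierarchy so that eventually every row terminates in an Id2 that does not equal any value of Id1.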

How can I implement this kind of join?

1 answer:

Answer 0 (score: 0)

The tidyverse is an excellent collection of packages for data manipulation. Here we can do the following:

library(tidyverse)
joined <- test %>% left_join(test %>% filter(Type==1), by = c("Id1" = "Id2"))
joined

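Update: to get the requested columns, filter the left-hand side to Type == 1 before joining on Id2 = Id1, then append the Type == 2 rows: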

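library(tidyverse)
joined <- test %>%
  filter(Type == 1) %>%                        # keep only the rows that should be joined
  left_join(test, by = c("Id2" = "Id1")) %>%   # self-join: Id2 on the left, Id1 on the right
  # append the Type == 2 rows unjoined; bind_rows() fills Id2.y and Type.y with NA
  bind_rows(test %>% filter(Type == 2) %>% rename(Type.x = Type))
joined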


Id1 Id2 Type.x Id2.y Type.y
  1   3      1    10      1
  2   4      1    11      2
  3  10      1    15      2
 11   9      1    NA     NA
  4  11      2    NA     NA
  5  12      2    NA     NA
 10  15      2    NA     NA
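
The result above lists the Type == 1 rows first, so the row order differs from the one in the question. If the original order matters, one possible variant (a sketch, not the answerer's code) is to join through a helper key that is NA for rows that should stay unjoined. The names key, lookup and joined_in_order are hypothetical, and the sketch assumes that Id1 contains no NA values, so an NA key never finds a partner.

```r
library(dplyr)

test <- data.frame(Id1  = c(1, 2, 3, 4, 5, 10, 11),
                   Id2  = c(3, 4, 10, 11, 12, 15, 9),
                   Type = c(1, 1, 1, 2, 2, 2, 1))

# Right-hand table, renamed up front so the join needs no suffix handling.
lookup <- test %>%
  rename(key = Id1, Id2.y = Id2, Type.y = Type)

joined_in_order <- test %>%
  # Only Type == 1 rows get a usable key; all other rows get NA.
  mutate(key = ifelse(Type == 1, Id2, NA)) %>%
  # left_join() keeps every left-hand row in its original order.
  left_join(lookup, by = "key") %>%
  select(-key) %>%
  rename(Type.x = Type)

joined_in_order
```

Because every left-hand row passes through a single left_join(), no bind_rows() step is needed and the rows stay in the order of test.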