Joining data frames on multiple conditions with dplyr

Time: 2020-09-18 19:14:04

Tags: r join dplyr

I have two data frames:

DF1

Name   Industry     Division     Job
Billy  Cameras      Finance      Analyst
Jane   Cameras      Finance      Scientist
Marge  Lightening   Operations   Analyst

DF2

Industry     Division        Job_        Rate
Cameras      Finance         Analyst     45
Cameras      Finance         Scientist   24
Cameras      Operations      Analyst     23
Cameras      Operations      Scientist   41
Lightening   Operations      Analyst     10
Lightening   Finance         Analyst     101

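I want to join DF2 onto DF1. That would be straightforward, since the match only depends on Industry, Division, and Job being the same in both. But how do I do it when the variable names differ?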

DF1 %>% 
  left_join(DF2, by = c('Industry', 'Division', 'Job'))

So I need Job to be matched against Job_. I cannot rename the column.

Desired result:

Name   Industry     Division     Job         Rate
Billy  Cameras      Finance      Analyst     45
Jane   Cameras      Finance      Scientist   24
Marge  Lightening   Operations   Analyst     10

1 answer:

Answer 0: (score: 2)

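If the column names differ, we can use = inside by to map the name in the first data set (left of the =) to the name in the second data set (right of the =). Columns that share a name in both are listed as plain strings.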

library(dplyr) 
DF1 %>% 
      left_join(DF2, by = c('Industry', 'Division', 'Job' = 'Job_'))
#   Name   Industry   Division       Job Rate
#1 Billy    Cameras    Finance   Analyst   45
#2  Jane    Cameras    Finance Scientist   24
#3 Marge Lightening Operations   Analyst   10
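
To try this end to end, here is a minimal sketch that rebuilds both data frames and runs the same join. The data.frame() calls are my own reconstruction of the tables printed in the question (the column types are an assumption), and result is just an illustrative name.

```r
library(dplyr)

# Reconstructed from the tables in the question (assumed column types)
DF1 <- data.frame(
  Name     = c("Billy", "Jane", "Marge"),
  Industry = c("Cameras", "Cameras", "Lightening"),
  Division = c("Finance", "Finance", "Operations"),
  Job      = c("Analyst", "Scientist", "Analyst")
)

DF2 <- data.frame(
  Industry = c("Cameras", "Cameras", "Cameras", "Cameras",
               "Lightening", "Lightening"),
  Division = c("Finance", "Finance", "Operations", "Operations",
               "Operations", "Finance"),
  Job_     = c("Analyst", "Scientist", "Analyst", "Scientist",
               "Analyst", "Analyst"),
  Rate     = c(45, 24, 23, 41, 10, 101)
)

# Shared names are given as plain strings; 'Job' = 'Job_' maps the
# DF1 column (left) to the DF2 column (right)
result <- DF1 %>%
  left_join(DF2, by = c("Industry", "Division", "Job" = "Job_"))

print(result)
```

The joined column keeps the name from DF1 (Job), so neither data frame has to be renamed. If your dplyr is version 1.1.0 or later, by = join_by(Industry, Division, Job == Job_) should express the same join.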

In base R, merge has the by.x and by.y arguments for specifying key columns whose names differ between the two data sets.
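
A rough sketch of the base R version, assuming the same two data frames as in the question (rebuilt here by hand so the snippet runs on its own; merged is an illustrative name). Passing all.x = TRUE makes it behave like a left join.

```r
# Reconstructed from the tables in the question (assumed column types)
DF1 <- data.frame(
  Name     = c("Billy", "Jane", "Marge"),
  Industry = c("Cameras", "Cameras", "Lightening"),
  Division = c("Finance", "Finance", "Operations"),
  Job      = c("Analyst", "Scientist", "Analyst")
)

DF2 <- data.frame(
  Industry = c("Cameras", "Cameras", "Cameras", "Cameras",
               "Lightening", "Lightening"),
  Division = c("Finance", "Finance", "Operations", "Operations",
               "Operations", "Finance"),
  Job_     = c("Analyst", "Scientist", "Analyst", "Scientist",
               "Analyst", "Analyst"),
  Rate     = c(45, 24, 23, 41, 10, 101)
)

# by.x and by.y are matched position by position:
# Industry-Industry, Division-Division, Job-Job_
merged <- merge(
  DF1, DF2,
  by.x  = c("Industry", "Division", "Job"),
  by.y  = c("Industry", "Division", "Job_"),
  all.x = TRUE  # keep every row of DF1, like left_join
)

# merge puts the key columns first; restore the order from the question
merged <- merged[, c("Name", "Industry", "Division", "Job", "Rate")]

print(merged)
```

Note that merge sorts the result by the key columns by default, so the row order may differ from DF1 on other data; sort = FALSE turns the sorting off.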