Merging two data frames by two common variables

Time: 2020-01-29 12:45:10

Tags: r dplyr

I have two data frames (df1 and df2) and want to merge them into a single df with the columns "GEO", "POP", "Value" and "Mean". Rows without a match should be assigned NA.

> df1
              GEO    Value            POP
1         Belgium   986494    Adolescents
2         Denmark   542496    Adolescents
3         Finland   472801    Adolescents
4          France  6568728    Adolescents
5         Germany  6177477    Adolescents
6           Italy  4564035    Adolescents
7     Netherlands  1608971    Adolescents
8           Spain  3550102    Adolescents
9  United Kingdom  5815087    Adolescents
10        Belgium  6910856         Adults
11        Denmark  3423077         Adults
12        Finland  3318043         Adults
13         France 39536853         Adults
14        Germany 50839124         Adults
15          Italy 37609721         Adults
16    Netherlands 10467463         Adults
17          Spain 29722963         Adults
18 United Kingdom 39436511         Adults

> df2
              GEO            POP    Mean
1         Belgium    Adolescents 1221.75
2         Denmark    Adolescents 2669.66
3         Finland    Adolescents 1378.44
4          France    Adolescents 2293.82
5         Germany    Adolescents 2412.83
6           Italy    Adolescents 1282.08
7     Netherlands    Adolescents 1431.87
8           Spain    Adolescents 5410.47
9  United Kingdom    Adolescents 1026.75
10        Belgium         Adults 1567.43
11        Denmark         Adults 4241.10
12        Finland         Adults 3938.95
13         France         Adults 3231.94
14        Germany         Adults 1840.54
15          Italy         Adults 1337.15
16    Netherlands         Adults 4157.15
17          Spain         Adults 3897.04

I need to merge them into one df. I tried a few dplyr functions:

bind_rows(df1,df2)
  • Problem: they have different lengths! So I tried intersect:
intersect(df1, df2)

Error: not compatible: 
- Cols in y but not x: `Value`. 
- Cols in x but not y: `Mean`. 

I also tried a join:

left_join(df1,df2, by "GEO", "POP")

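But as far as I can tell this only works with a single common column, whereas I have two columns (GEO and POP) that both have to be taken into account in the join. Do you have any ideas?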

1 answer:

Answer 0 (score: 2)

df1 %>%
    left_join(df2, by = c("GEO", "POP"))
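
To see what this does, here is a minimal sketch on a cut-down copy of the data. It assumes dplyr is installed; the four-row df1 and three-row df2 below are just a subset of the values from the question, rebuilt by hand for illustration. The key point is that `by` takes a character vector, so both GEO and POP are used as join keys, and the row with no partner in df2 (United Kingdom / Adults) gets NA for Mean.

```r
library(dplyr)

# Hand-built subset of the question's data (illustration only)
df1 <- data.frame(
  GEO   = c("Spain", "United Kingdom", "Spain", "United Kingdom"),
  Value = c(3550102, 5815087, 29722963, 39436511),
  POP   = c("Adolescents", "Adolescents", "Adults", "Adults"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  GEO  = c("Spain", "United Kingdom", "Spain"),
  POP  = c("Adolescents", "Adolescents", "Adults"),
  Mean = c(5410.47, 1026.75, 3897.04),
  stringsAsFactors = FALSE
)

# Join on both key columns; every row of df1 is kept
merged <- df1 %>%
  left_join(df2, by = c("GEO", "POP")) %>%
  select(GEO, POP, Value, Mean)  # column order asked for in the question

merged
```

If df2 could also contain GEO/POP combinations that are missing from df1 and you want to keep those too, `full_join()` with the same `by` argument should do that. Without dplyr, `merge(df1, df2, by = c("GEO", "POP"), all.x = TRUE)` in base R gives an equivalent left join, though the row order may differ.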