How do I merge two data frames in R with merge to get the desired output?

Time: 2019-06-21 16:06:42

Tags: r merge

I have two data frames:

d1.Kids <- c("Jack", "Jill", "Jillian", "John", "James")
d1.States <- c("CA", "MA", "DE", "HI", "PA")

d1 <- data.frame(d1.Kids, d1.States)

d1

   d1.Kids d1.States
1    Jack        CA
2    Jill        MA
3 Jillian        DE
4    John        HI
5   James        PA

d2.Ages <- c(10, 7, 12, 30)
d2.Kids <- c("Jill", "Jillian", "Jack", "Mary")

d2 <- data.frame(d2.Kids, d2.Ages)
d2
   d2.Kids d2.Ages
1    Jill      10
2 Jillian       7
3    Jack      12
4    Mary      30

When I merge these two data frames, I get the following result:

merge(d1,d2)

Result:

   d1.Kids d1.States d2.Kids d2.Ages
1     Jack        CA    Jill      10
2     Jill        MA    Jill      10
3  Jillian        DE    Jill      10
4     John        HI    Jill      10
5    James        PA    Jill      10
6     Jack        CA Jillian       7
7     Jill        MA Jillian       7
8  Jillian        DE Jillian       7
9     John        HI Jillian       7
10   James        PA Jillian       7
11    Jack        CA    Jack      12
12    Jill        MA    Jack      12
13 Jillian        DE    Jack      12
14    John        HI    Jack      12
15   James        PA    Jack      12
16    Jack        CA    Mary      30
17    Jill        MA    Mary      30
18 Jillian        DE    Mary      30
19    John        HI    Mary      30
20   James        PA    Mary      30

I would like to get this result instead:

   kids    ages   states                    
1  jack     12     CA
2  jill     10     MA
3 jillian    7     DE
4 john      NA     HI
5 james     NA     PA
6  Mary     30     NA

1 answer:

Answer 0: (score: 1)

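If by is not given, merge falls back to the column names the two data frames have in common. Here there are none, so it performs a cross join. We can avoid that with the by options. Because the key columns have different names in the two data frames, use by.x and by.y, and set all = TRUE to perform a full join:
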
out <- merge(d1,d2, by.x = 'd1.Kids', by.y = 'd2.Kids', all = TRUE)
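
To see why the plain merge(d1, d2) call returns 20 rows, it can help to check which column names the two data frames share. The following is a minimal sketch using the data from the question; the variable name shared is only for illustration and does not appear in the original post.

```r
# Data from the question
d1 <- data.frame(d1.Kids   = c("Jack", "Jill", "Jillian", "John", "James"),
                 d1.States = c("CA", "MA", "DE", "HI", "PA"))
d2 <- data.frame(d2.Kids = c("Jill", "Jillian", "Jack", "Mary"),
                 d2.Ages = c(10, 7, 12, 30))

# merge() defaults to by = intersect(names(x), names(y))
shared <- intersect(names(d1), names(d2))
length(shared)        # 0: the two data frames share no column names

# With no key columns, merge() pairs every row of d1 with every row of d2
nrow(merge(d1, d2))   # 5 * 4 = 20

# Naming the key columns explicitly and setting all = TRUE
# keeps unmatched rows from both sides (a full join)
out <- merge(d1, d2, by.x = "d1.Kids", by.y = "d2.Kids", all = TRUE)
nrow(out)             # 6
```

The merged key column takes its name from by.x, so out has the columns d1.Kids, d1.States and d2.Ages.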

Then change the names of 'out' by removing the prefix part:

names(out) <- sub("^[^.]+\\.", "", names(out))
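
Putting the two steps together, the sketch below gets close to the layout asked for in the question. The lowercase column names and the column reordering are extra steps assumed here to match the desired table; they are not part of the original answer.

```r
# Data from the question
d1 <- data.frame(d1.Kids   = c("Jack", "Jill", "Jillian", "John", "James"),
                 d1.States = c("CA", "MA", "DE", "HI", "PA"))
d2 <- data.frame(d2.Kids = c("Jill", "Jillian", "Jack", "Mary"),
                 d2.Ages = c(10, 7, 12, 30))

# Full join on the differently named key columns
out <- merge(d1, d2, by.x = "d1.Kids", by.y = "d2.Kids", all = TRUE)

# Drop everything up to and including the first dot: "Kids" "States" "Ages"
names(out) <- sub("^[^.]+\\.", "", names(out))

# Assumed extra steps: lowercase the names and reorder the columns
names(out) <- tolower(names(out))
out <- out[, c("kids", "ages", "states")]
out
```

Kids without an age (John, James) get NA in ages, and Mary gets NA in states. By default merge sorts the rows by the key column, so the row order may differ from the table in the question, and the values in kids keep their original capitalization.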