我有两个数据集。它们中的每一个都有变量ID,Block和RT(反应时间)。我想合并/列绑定两个集合,以便我有一个包含变量的数据集:ID,Block,RT1,RT2。问题是两组中的行数不等。此外,ID和块号匹配也很重要。丢失的值应替换为NA。所以我有:
head(blok1, 10)
ID Blok RT1
1 1 1 592
2 1 1 468
3 1 1 530
4 1 1 546
5 1 1 452
6 1 1 483
7 1 2 499
8 1 2 452
9 1 2 608
10 1 2 530
head(blok2, 10)
ID Blok RT2
1 1 1 592
2 1 1 920
3 1 1 686
4 1 1 561
5 1 1 561
6 1 2 327
7 1 2 686
8 1 2 670
9 1 2 702
10 1 3 920
我想要的是:
ID Blok RT1 RT2
1 1 1 592 592
2 1 1 468 920
3 1 1 530 686
4 1 1 546 561
5 1 1 452 561
6 1 1 483 NA
7 1 2 499 327
8 1 2 452 686
9 1 2 608 670
10 1 2 530 702
等
答案 0 :(得分:0)
这是使用dplyr的解决方案,也使用索引或唯一ID:
blok1 <- data.frame(ID = c(1, 1, 2), RT1 = c(11, 12, 13))
blok2 <- data.frame(ID = c(1, 2, 2), RT2 = c(21, 22, 23))
library(dplyr)
## if you want NAs for RT2 only
blok1 %>%
mutate(uID = row_number()) %>%
left_join(blok2 %>% mutate(uID = row_number()), by = c("uID", "ID"))
# uID ID RT1 RT2
# 1 1 1 11 21
# 2 2 1 12 NA
# 3 3 2 13 23
## if you want NAs for both RT1 and RT2
blok1 %>%
mutate(uID = row_number()) %>%
outer_join(blok2 %>% mutate(uID = row_number()), by = c("uID", "ID"))
# uID ID RT1 RT2
# 1 1 1 11 21
# 2 2 1 12 NA
# 3 3 2 13 23
# 4 2 2 NA 22