Consider the following code, which builds a data frame:
df1 <- data.frame(
  ID  = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  X_A = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  X_B = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  Y_A = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA),
  Y_B = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA)
)
It produces the following data frame:
ID X_A X_B Y_A Y_B
1 A 1 1 1 1
2 A 2 2 2 2
3 A 3 3 NA NA
4 A 4 4 NA NA
5 A 5 5 10 10
6 B NA NA 8 8
7 B NA NA 9 9
8 B 8 8 10 10
9 B 9 9 NA NA
10 B 10 10 NA NA
I want to transfer data from this data frame into df2, which looks like this:
ID X_A Y_A
1 A 1 1
2 A 2 2
3 A 3 3
4 A 4 4
5 A 5 5
6 A 6 6
7 A 7 7
8 A 8 8
9 A 9 9
10 A 10 10
11 B 1 1
12 B 2 2
13 B 3 3
14 B 4 4
15 B 5 5
16 B 6 6
17 B 7 7
18 B 8 8
19 B 9 9
20 B 10 10
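The question does not include the code that builds df2. A minimal sketch that would reproduce the table above, assuming X_A and Y_A are plain numeric columns that count from 1 to 10 within each ID:

```r
# Assumed construction of df2 (not given in the original question):
# ten rows per ID, with X_A and Y_A both counting 1..10.
df2 <- data.frame(
  ID  = rep(c("A", "B"), each = 10),
  X_A = rep(as.numeric(1:10), times = 2),
  Y_A = rep(as.numeric(1:10), times = 2)
)
str(df2)
```

Storing the keys as doubles (rather than integers) keeps them the same type as the X_A and Y_A columns of df1.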
The final data frame should look like this:
ID X_A Y_A X_B Y_B
1 A 1 1 1 1
2 A 2 2 2 2
3 A 3 3 3 NA
4 A 4 4 4 NA
5 A 5 5 5 NA
6 A 6 6 NA NA
7 A 7 7 NA NA
8 A 8 8 NA NA
9 A 9 9 NA NA
10 A 10 10 NA NA
11 B 1 1 NA NA
12 B 2 2 NA NA
13 B 3 3 NA NA
14 B 4 4 NA NA
15 B 5 5 NA NA
16 B 6 6 NA NA
17 B 7 7 NA NA
18 B 8 8 8 8
19 B 9 9 9 9
20 B 10 10 10 10
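The final output is similar to the result of a vlookup: df1 and df2 are matched on ID and X_A, and on ID and Y_A, so that the corresponding X_B and Y_B values are filled into df2. Where there is no match, the result should be NA. I tried the following code:
merge(df1, df2)
but this slows my system down. I also tried: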
library(dplyr)
df2 %>% right_join(df1, by = c("ID", "X_A", "Y_A"))
This returned all rows. Can the expected output be produced in R? Any help would be appreciated.
Answer 0 (score: 1)
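Do you mean joining on ID and X_A once to get X_B, and then joining on ID and Y_A to get Y_B? Note that row 10 differs from your expected output: the code that builds df1 sets Y_A = 10 and Y_B = 10 in row 5 (ID A), so the second join finds a match there: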
library(dplyr)

# Look up X_B by (ID, X_A), then Y_B by (ID, Y_A); left joins keep every row of df2
df2 %>%
  left_join(select(df1, ID, X_A, X_B),
            by = c("ID", "X_A")) %>%
  left_join(select(df1, ID, Y_A, Y_B),
            by = c("ID", "Y_A"))
# ID X_A Y_A X_B Y_B
# 1 A 1 1 1 1
# 2 A 2 2 2 2
# 3 A 3 3 3 NA
# 4 A 4 4 4 NA
# 5 A 5 5 5 NA
# 6 A 6 6 NA NA
# 7 A 7 7 NA NA
# 8 A 8 8 NA NA
# 9 A 9 9 NA NA
# 10 A 10 10 NA 10
# 11 B 1 1 NA NA
# 12 B 2 2 NA NA
# 13 B 3 3 NA NA
# 14 B 4 4 NA NA
# 15 B 5 5 NA NA
# 16 B 6 6 NA NA
# 17 B 7 7 NA NA
# 18 B 8 8 8 8
# 19 B 9 9 9 9
# 20 B 10 10 10 10
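A left join returns exactly one row per row of df2 only if each lookup key appears at most once in df1. A minimal base R sketch of that check follows; the helper name `has_unique_keys` is hypothetical, and rows with a missing key are ignored on the assumption that df2 never contains NA keys:

```r
df1 <- data.frame(
  ID  = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  X_A = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  X_B = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  Y_A = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA),
  Y_B = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA)
)

# Hypothetical helper: TRUE when no non-NA combination of `cols` repeats in `df`.
has_unique_keys <- function(df, cols) {
  keys <- df[stats::complete.cases(df[, cols, drop = FALSE]), cols, drop = FALSE]
  anyDuplicated(keys) == 0
}

has_unique_keys(df1, c("ID", "X_A"))  # key used by the first join
has_unique_keys(df1, c("ID", "Y_A"))  # key used by the second join
```

If either call returned FALSE, the corresponding join would produce more rows than df2 has, and df1 would need deduplicating first.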
Base R:
# all.x = TRUE makes each merge a left join on df2; merge() may reorder rows and columns
want <- merge(df2, subset(df1, select = c(ID, X_A, X_B)), by = c("ID", "X_A"), all.x = TRUE)
(want <- merge(want, subset(df1, select = c(ID, Y_A, Y_B)), by = c("ID", "Y_A"), all.x = TRUE))
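Since the question frames this as a vlookup, another base R option is `match()`, which returns the position of the first matching key (or NA when there is none) and leaves the rows of df2 in their original order. The following is only a sketch: the `key` helper is hypothetical, the construction of df2 is assumed from the table in the question, and pasting columns into a string key assumes the values print identically in both data frames.

```r
df1 <- data.frame(
  ID  = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  X_A = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  X_B = c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10),
  Y_A = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA),
  Y_B = c(1, 2, NA, NA, 10, 8, 9, 10, NA, NA)
)
# Assumed construction of df2 (the question only shows its printed form).
df2 <- data.frame(
  ID  = rep(c("A", "B"), each = 10),
  X_A = rep(as.numeric(1:10), times = 2),
  Y_A = rep(as.numeric(1:10), times = 2)
)

# Hypothetical helper: one string key per row, built from the given columns.
# The "\r" separator is an arbitrary choice that is unlikely to occur in the data.
key <- function(df, cols) do.call(paste, c(df[cols], sep = "\r"))

# Position in df1 of the first row whose key equals each df2 key (NA if none).
idx_x <- match(key(df2, c("ID", "X_A")), key(df1, c("ID", "X_A")))
idx_y <- match(key(df2, c("ID", "Y_A")), key(df1, c("ID", "Y_A")))

# Indexing with NA yields NA, which gives the "no match" behaviour.
df2$X_B <- df1$X_B[idx_x]
df2$Y_B <- df1$Y_B[idx_y]
df2
```

As with the joins above, row 10 ends up with Y_B = 10 because df1 contains a row with ID A and Y_A = 10.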