How to merge two data frames in R when several rows share the same ID

Time: 2020-10-21 08:06:10

Tags: r merge

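I am trying to merge two data.frames, as in the code below. The main problem is that several rows share the same ID, and I want every value (phase) from df2 attached to the matching ID rows in df1, one to one and in order. I searched for similar questions but could not find any.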

df1<-as.data.frame(c("a","a","a","a","a","c","c","c","b","b"))
colnames(df1)<-c("ID")
df2<-data.frame(c("a","a","a","a","a","b","b"),c(1,1,0,0,1,1,-1))
colnames(df2)<-c("ID","phase")

output<-cbind(c("a","a","a","a","a","c","c","c","b","b"),c(1,1,0,0,1,NA,NA,NA,1,-1))

I tried merge(), but it returns a data.frame much larger than the expected output. I also lose the NA rows that should appear for "c".

merge_out<-merge(df1,df2[,c("ID","phase")],by="ID")

ID phase
a     1
a     1
a     0
a     0
a     1
a     1
a     1
a     0
a     0
a     1
a     1
a     1
a     0
a     0
a     1
a     1
a     1
a     0
a     0
a     1
a     1
a     1
a     0
a     0
a     1
b     1
b    -1
b     1
b    -1
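
The size of this result is consistent with merge() pairing every df1 row with every df2 row that has the same ID. The sketch below checks that arithmetic on the example data; `n1`, `n2`, `shared`, `expected_rows` and `merge_all` are names introduced here for illustration only. It also shows that `all.x = TRUE` brings back the "c" rows as NA but does not remove the duplication.

```r
# Same example data as in the question, built with named columns.
df1 <- data.frame(ID = c("a","a","a","a","a","c","c","c","b","b"))
df2 <- data.frame(ID = c("a","a","a","a","a","b","b"),
                  phase = c(1,1,0,0,1,1,-1))

# Rows contributed per ID: (count in df1) * (count in df2).
n1 <- table(df1$ID)
n2 <- table(df2$ID)
shared <- intersect(names(n1), names(n2))
expected_rows <- sum(as.vector(n1[shared]) * as.vector(n2[shared]))  # 5*5 + 2*2

merge_out <- merge(df1, df2, by = "ID")
nrow(merge_out) == expected_rows

# all.x = TRUE keeps the unmatched "c" rows (phase = NA),
# but the many-to-many duplication for "a" and "b" is still there.
merge_all <- merge(df1, df2, by = "ID", all.x = TRUE)
nrow(merge_all)
```

So the key has to distinguish the first "a" from the second "a", and so on, before a plain join can give the one-to-one result.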

Any ideas? Thanks!

2 answers:

Answer 0: (score: 1)

Maybe you are looking for pmatch:

cbind(df1, phase=df2$phase[pmatch(df1$ID, df2$ID)])
#cbind(df1, df2[pmatch(df1$ID, df2$ID), "phase", drop = FALSE]) #Alternative
#   ID phase
#1   a     1
#2   a     1
#3   a     0
#4   a     0
#5   a     1
#6   c    NA
#7   c    NA
#8   c    NA
#9   b     1
#10  b    -1
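
A small sketch of why this works, and one caveat. With the default `duplicates.ok = FALSE`, pmatch() excludes each table entry once it has been matched, so repeated IDs are consumed in order. It also does partial matching, so an ID that is a prefix of another ID can pick up the wrong row. The names `idx`, `res` and `risky` are illustrative only.

```r
df1 <- data.frame(ID = c("a","a","a","a","a","c","c","c","b","b"))
df2 <- data.frame(ID = c("a","a","a","a","a","b","b"),
                  phase = c(1,1,0,0,1,1,-1))

# Each df2 row can be used only once, so the k-th "a" in df1
# is paired with the k-th "a" in df2; "c" has no match and gives NA.
idx <- pmatch(df1$ID, df2$ID)
idx

res <- cbind(df1, phase = df2$phase[idx])

# Caveat: once the exact match is used up, a unique partial match is accepted.
# Here the second "ab" falls through to "abc".
risky <- pmatch(c("ab", "ab"), c("abc", "ab"))
risky
```

If some IDs are prefixes of others, an approach that joins on an explicit occurrence counter may be the safer choice.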

Answer 1: (score: 0)

This works:

library(dplyr)
df1 %>%
  group_by(ID) %>%
  mutate(uid = paste0(row_number(), ID)) %>%
  left_join(
    df2 %>% group_by(ID) %>% mutate(uid = paste0(row_number(), ID))
  ) %>%
  select(-uid)
Joining, by = c("ID", "uid")
# A tibble: 10 x 2
# Groups:   ID [3]
   ID    phase
   <chr> <dbl>
 1 a         1
 2 a         1
 3 a         0
 4 a         0
 5 a         1
 6 c        NA
 7 c        NA
 8 c        NA
 9 b         1
10 b        -1
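
The same idea (number the rows within each ID, then join on ID plus that number) can be sketched in base R without dplyr. This is an alternative under stated assumptions, not the answerer's code: the helper columns `k` and `row` and the result name `out` are hypothetical, and ave() with seq_along is swapped in for row_number().

```r
df1 <- data.frame(ID = c("a","a","a","a","a","c","c","c","b","b"))
df2 <- data.frame(ID = c("a","a","a","a","a","b","b"),
                  phase = c(1,1,0,0,1,1,-1))

# k = running occurrence number of each ID within its own data frame.
df1$k <- ave(seq_along(df1$ID), df1$ID, FUN = seq_along)
df2$k <- ave(seq_along(df2$ID), df2$ID, FUN = seq_along)

# merge() sorts by the key columns, so remember the original row order.
df1$row <- seq_len(nrow(df1))

# Left join on (ID, k): each df1 row now has at most one partner in df2.
out <- merge(df1, df2, by = c("ID", "k"), all.x = TRUE)
out <- out[order(out$row), c("ID", "phase")]
rownames(out) <- NULL
out
```

Because the join key is the pair (ID, k), IDs that are prefixes of one another are not confused, and df1 rows without a partner (all "c" rows, or surplus repeats of an ID) come back with NA.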