Merge 2 data frames with different values in a column in R

Time: 2016-02-17 17:17:33

Tags: r merge dataframe

I have two data frames like the ones below, and I want to merge them in a different way than a plain merge does.

A <- c("CC1_PH","CC1_PH","CC1_PH","CC2_PH","CC2_PH","CC2_PH")
B <- c("MEAS_Length","MEAS_Breadth","MEAS_Height","MEAS_Breadth","MEAS_Height","MEAS_Length")
df1 <- data.frame(A,B)

A <- c("CC1_PH","CC1_PH","CC2_PH","CC2_PH")
B <- c("*","MEAS_Breadth","*","MEAS_Height")
EmpID <- c(444452,16822,339862,14828)
ManagerID <- c(11499,11499,11669,11669)
df2 <- data.frame(A,B,EmpID,ManagerID)

Then I merge these two data frames:

df <- merge(df1,df2,by=c("A","B"),all.x=TRUE)

       A            B EmpID ManagerID
1 CC1_PH MEAS_Breadth 16822     11499
2 CC1_PH  MEAS_Height    NA        NA
3 CC1_PH  MEAS_Length    NA        NA
4 CC2_PH MEAS_Breadth    NA        NA
5 CC2_PH  MEAS_Height 14828     11669
6 CC2_PH  MEAS_Length    NA        NA

My desired output is:

       A            B  EmpID ManagerID
1 CC1_PH MEAS_Breadth  16822     11499
2 CC1_PH  MEAS_Height 444452     11499
3 CC1_PH  MEAS_Length 444452     11499
4 CC2_PH MEAS_Breadth 339862     11669
5 CC2_PH  MEAS_Height  14828     11669
6 CC2_PH  MEAS_Length 339862     11669

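I would like to replace each NA with the values from the corresponding "*" (star) row of df2. How can I set up a condition so that, when a row has no exact match, it returns the ManagerID and EmpID of the "*" row that has the same value in column "A"?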

Kindly provide some pointers on how we can achieve this.

1 Answer:

Answer 0: (score: 1)

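We create two logical indexes: 'i1', based on the occurrence of * in the 'B' column of 'df2', and 'i2', based on the NA values in 'EmpID' of 'df'. We then loop through the corresponding columns of 'df' and 'df2' (the 3rd and 4th) with Map. For each pair, we match the 'A' column of 'df' against the 'A' values of the * rows in 'df2' (selected with 'i1'), and use 'i2' to replace only the NA entries of the 'df' column with the corresponding values from the 'df2' column. The output is assigned back to the 3rd and 4th columns of 'df'.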

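# i1: rows of df2 whose B is the wildcard "*"
i1 <- df2$B == '*'
# i2: rows of df that the merge left unmatched (EmpID is NA)
i2 <- is.na(df$EmpID)
# For columns 3:4 (EmpID, ManagerID), fill the NA rows of df with the
# value from the "*" row of df2 that has the same A
df[3:4] <- Map(function(x, y) {
  x[i2] <- y[i1][match(df$A, df2$A[i1])][i2]
  x
}, df[3:4], df2[3:4])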
df
#       A            B  EmpID ManagerID
#1 CC1_PH MEAS_Breadth  16822     11499
#2 CC1_PH  MEAS_Height 444452     11499
#3 CC1_PH  MEAS_Length 444452     11499
#4 CC2_PH MEAS_Breadth 339862     11669
#5 CC2_PH  MEAS_Height  14828     11669
#6 CC2_PH  MEAS_Length 339862     11669
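
The same idea can also be written as separate steps, which may be easier to follow than the single Map call. The following is only a sketch under a few assumptions that are not in the original post: the inputs are rebuilt with stringsAsFactors = FALSE so that the key columns are plain character vectors, each value of 'A' has at most one * row in 'df2', and the names 'exact', 'defaults', 'pos', 'missing' and 'result' are introduced here purely for illustration.

```r
# Rebuild the question's inputs. stringsAsFactors = FALSE is an assumption
# made here so the key columns are character vectors in every R version.
df1 <- data.frame(
  A = c("CC1_PH", "CC1_PH", "CC1_PH", "CC2_PH", "CC2_PH", "CC2_PH"),
  B = c("MEAS_Length", "MEAS_Breadth", "MEAS_Height",
        "MEAS_Breadth", "MEAS_Height", "MEAS_Length"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  A = c("CC1_PH", "CC1_PH", "CC2_PH", "CC2_PH"),
  B = c("*", "MEAS_Breadth", "*", "MEAS_Height"),
  EmpID = c(444452, 16822, 339862, 14828),
  ManagerID = c(11499, 11499, 11669, 11669),
  stringsAsFactors = FALSE
)

# Step 1: exact match on both keys; rows without a match get NA.
exact <- merge(df1, df2, by = c("A", "B"), all.x = TRUE)

# Step 2: treat the "*" rows as per-A defaults
# (assumes at most one "*" row per value of A).
defaults <- df2[df2$B == "*", c("A", "EmpID", "ManagerID")]
pos <- match(exact$A, defaults$A)

# Step 3: fill only the rows that the exact merge left empty.
result <- exact
missing <- is.na(result$EmpID)
result$EmpID[missing] <- defaults$EmpID[pos[missing]]
result$ManagerID[missing] <- defaults$ManagerID[pos[missing]]
result
```

Under these assumptions 'result' should hold the same six rows as the desired output in the question. A value of 'A' that has no * row in 'df2' would simply keep its NA values.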