I have two data frames like the ones below, and I want to merge them in a non-standard way:
A <- c("CC1_PH","CC1_PH","CC1_PH","CC2_PH","CC2_PH","CC2_PH")
B <- c ("MEAS_Length","MEAS_Breadth","MEAS_Height","MEAS_Breadth","MEAS_Height","MEAS_Length")
df1 <- data.frame(A,B)
A <- c("CC1_PH","CC1_PH","CC2_PH","CC2_PH")
B <- c ("*","MEAS_Breadth","*","MEAS_Height")
EmpID <- c(444452,16822,339862,14828)
ManagerID <- c(11499,11499,11669,11669)
df2 <- data.frame(A,B,EmpID,ManagerID)
Then I merge the two data frames:
df <- merge(df1,df2,by=c("A","B"),all.x=TRUE)
A B EmpID ManagerID
1 CC1_PH MEAS_Breadth 16822 11499
2 CC1_PH MEAS_Height NA NA
3 CC1_PH MEAS_Length NA NA
4 CC2_PH MEAS_Breadth NA NA
5 CC2_PH MEAS_Height 14828 11669
6 CC2_PH MEAS_Length NA NA
The output I want is:
A B EmpID ManagerID
1 CC1_PH MEAS_Breadth 16822 11499
2 CC1_PH MEAS_Height 444452 11499
3 CC1_PH MEAS_Length 444452 11499
4 CC2_PH MEAS_Breadth 339862 11669
5 CC2_PH MEAS_Height 14828 11669
6 CC2_PH MEAS_Length 339862 11669
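I want to replace the NAs with the values from the corresponding "*" (asterisk) rows. How can I set up a condition so that a "*" in column B of df2 acts as a fallback, returning the EmpID and ManagerID for the matching value in column A?
Please provide some pointers on how this can be achieved.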
Answer 0 (score: 1)
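We create two logical indices: one based on the occurrence of * in the 'B' column of 'df2' ('i1'), and one based on the NA values in 'EmpID' of 'df' ('i2'). Looping over the corresponding columns of 'df' and 'df2' (the 3rd and 4th) with Map, we match the 'A' column of 'df' against the 'A' column of 'df2' and, using 'i1' and 'i2', replace the NA entries in each 'df' column with the corresponding values from 'df2'. The output is assigned back to the 3rd and 4th columns of 'df'.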
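# i1: rows of df2 that hold the "*" fallback values
i1 <- df2$B == '*'
# i2: rows of df that the merge left unmatched
i2 <- is.na(df$EmpID)
# For columns 3 and 4, fill the NA rows with the "*" value for the same 'A'
df[3:4] <- Map(function(x, y) {
  x[i2] <- y[i1][match(df$A, df2$A[i1])][i2]
  x
}, df[3:4], df2[3:4])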
df
# A B EmpID ManagerID
#1 CC1_PH MEAS_Breadth 16822 11499
#2 CC1_PH MEAS_Height 444452 11499
#3 CC1_PH MEAS_Length 444452 11499
#4 CC2_PH MEAS_Breadth 339862 11669
#5 CC2_PH MEAS_Height 14828 11669
#6 CC2_PH MEAS_Length 339862 11669
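
The same idea can also be written as a two-step lookup, which some readers may find easier to follow. This is only a sketch in base R, not the answer's own method: it rebuilds the question's data, merges on the exact ('A', 'B') pairs first, and then fills whatever is still NA from the "*" rows by matching on 'A' alone. The names `exact`, `fallback`, `res`, `idx` and `miss` are illustrative, and the sketch assumes each value of 'A' has at most one "*" row in 'df2'.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  A = c("CC1_PH", "CC1_PH", "CC1_PH", "CC2_PH", "CC2_PH", "CC2_PH"),
  B = c("MEAS_Length", "MEAS_Breadth", "MEAS_Height",
        "MEAS_Breadth", "MEAS_Height", "MEAS_Length"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  A = c("CC1_PH", "CC1_PH", "CC2_PH", "CC2_PH"),
  B = c("*", "MEAS_Breadth", "*", "MEAS_Height"),
  EmpID = c(444452, 16822, 339862, 14828),
  ManagerID = c(11499, 11499, 11669, 11669),
  stringsAsFactors = FALSE
)

# Split df2 into exact rules and "*" fallback rules
exact    <- df2[df2$B != "*", ]
fallback <- df2[df2$B == "*", c("A", "EmpID", "ManagerID")]

# Step 1: left join on the exact (A, B) pairs
res <- merge(df1, exact, by = c("A", "B"), all.x = TRUE)

# Step 2: for rows still unmatched, look up the fallback row by 'A' only
idx  <- match(res$A, fallback$A)
miss <- is.na(res$EmpID)
res$EmpID[miss]     <- fallback$EmpID[idx[miss]]
res$ManagerID[miss] <- fallback$ManagerID[idx[miss]]

res
```

Splitting 'df2' up front keeps the "*" rows out of the exact merge, so the fallback step only touches rows that had no specific match. If a value of 'A' has no "*" row at all, its unmatched rows simply stay NA.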