I have a large data frame, like the example below.
df <- data.frame(IND = 1:20,
                 S = LETTERS[1:20],
                 FA = c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
                 MO = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5))
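I need to create two new variables (SFA and SMO): wherever FA or MO equals a value in IND, the new variable should hold the S of that matching row; otherwise it keeps the original FA or MO value. I need the following output: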
out <- data.frame(IND= seq(1:20),
S = LETTERS[1:20],
FA=c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
MO = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5),
SFA=c(0,0,133,0,"B","B","B",0,0,"D","D","D","F","F",0,0,0,"D","B","H"),
SMO=c(77,0,77,"A","C","A","A","A",0,"C","A","E","E","C",0,0,100,"C","E","E"))
I tried matching the variables and then merging, but it did not work well.
Thanks
Answer 0 (score: 1)
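To get the value of S for the row where FA (or MO) == IND, you can use the match function to find the index and subset S with it, i.e. S[match(FA, IND)] (or S[match(MO, IND)]). Values with no match come back as NA, so then use the coalesce function to fill those NAs with the values from the original vector: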
library(dplyr)
df %>% mutate(SFA = coalesce(as.character(S[match(FA, IND)]), as.character(FA)),
SMO = coalesce(as.character(S[match(MO, IND)]), as.character(MO)))
# IND S FA MO SFA SMO
#1 1 A 0 77 0 77
#2 2 B 0 0 0 0
#3 3 C 133 77 133 77
#4 4 D 0 1 0 A
#5 5 E 2 3 B C
#6 6 F 2 1 B A
#7 7 G 2 1 B A
#8 8 H 0 1 0 A
# ...
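
If you would rather not load dplyr, the same match-then-fill idea can be sketched in base R. This is only a minimal sketch under a few assumptions: lookup_label is a hypothetical helper name I made up for illustration, the data frame is rebuilt here so the snippet runs on its own, and ifelse with is.na stands in for coalesce.

```r
# Example data from the question, rebuilt so the snippet is self-contained.
df <- data.frame(
  IND = 1:20,
  S   = LETTERS[1:20],
  FA  = c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
  MO  = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5),
  stringsAsFactors = FALSE
)

# lookup_label is a hypothetical helper (not part of any package):
# for each key, return the label of the row whose id equals the key,
# falling back to the key itself (as text) when there is no match.
lookup_label <- function(key, ids, labels) {
  hit <- as.character(labels)[match(key, ids)]  # NA where key is not in ids
  ifelse(is.na(hit), as.character(key), hit)    # base R stand-in for coalesce
}

df$SFA <- lookup_label(df$FA, df$IND, df$S)
df$SMO <- lookup_label(df$MO, df$IND, df$S)

head(df, 8)
```

Both new columns come out as character vectors, because letters and numbers cannot share a numeric column; that is also why the dplyr version wraps both arguments of coalesce in as.character.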