R: Hierarchical ifelse conditions with multiple parallel results

Time: 2017-01-10 22:01:08

Tags: r

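I am looking for an elegant solution to the following problem:

I need to assign owners to firms based on different matching criteria. These criteria differ in quality, so a weaker criterion should be used only when the higher-quality criteria yield no result. In my example, all a-criteria share the same quality level, which is higher than that of the b-criterion.

The following illustrates my point: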

firmname <- c("Firm A", "Firm B", "Firm C", "Firm D", "Firm E", "Firm F")
ownermatch_a1 <- c("Owner 1", NA, NA, NA, "Owner 5", "Owner 6")
ownermatch_a2 <- c("Owner 1", NA, NA, "Owner 4", "Owner 5", "Owner 6")
ownermatch_a3 <- c("Owner 1", NA, "Owner 3", "Owner 4", "Owner 5", "Owner 6")
ownermatch_b1 <- c("Owner 1", "Owner 2", "Owner 3", "Owner 4", "Owner 5", "Owner 6")
ownerfinal <- (NA)

data.frame(firmname, ownermatch_a1, ownermatch_a2, ownermatch_a3, ownermatch_b1, ownerfinal)

This produces the following table:

 firmname ownermatch_a1 ownermatch_a2 ownermatch_a3 ownermatch_b1 ownerfinal
1   Firm A       Owner 1       Owner 1       Owner 1       Owner 1       <NA>
2   Firm B          <NA>          <NA>          <NA>       Owner 2       <NA>
3   Firm C          <NA>          <NA>       Owner 3       Owner 3       <NA>
4   Firm D          <NA>       Owner 4       Owner 4       Owner 4       <NA>
5   Firm E       Owner 5       Owner 5       Owner 5       Owner 5       <NA>
6   Firm F       Owner 6       Owner 6       Owner 6       Owner 6       <NA>

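I now want R to do the following:
1) If any of the three a-criteria is non-NA, set it as ownerfinal.
2) If several parallel a-criteria are non-NA, pick any one of them at random and set it as ownerfinal.
3) Only if all of them are NA, take ownermatch_b1 and set it as ownerfinal.

So in the example above:
Firm A: pick any of a1, a2, a3
Firm B: pick b1
Firm C: pick a3
Firm D: pick a2 or a3

Thanks!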

2 answers:

Answer 0: (score: 2)

No loop is needed here. ?max.col is your friend for finding the valid cases across columns and picking one of them at random:

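# dat is assumed to hold the data frame from the question
# !is.na() marks the valid (non-NA) a-level matches; max.col() then picks one per row, breaking ties at random
tmp <- dat[2:4][cbind(seq_len(nrow(dat)), max.col(!is.na(dat[2:4])))]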
dat$ownerfinal <- replace(tmp, is.na(tmp), as.character(dat$ownermatch_b1)[is.na(tmp)])
dat

#  firmname ownermatch_a1 ownermatch_a2 ownermatch_a3 ownermatch_b1 ownerfinal
#1   Firm A       Owner 1       Owner 1       Owner 1       Owner 1    Owner 1
#2   Firm B          <NA>          <NA>          <NA>       Owner 2    Owner 2
#3   Firm C          <NA>          <NA>       Owner 3       Owner 3    Owner 3
#4   Firm D          <NA>       Owner 4       Owner 4       Owner 4    Owner 4
#5   Firm E       Owner 5       Owner 5       Owner 5       Owner 5    Owner 5
#6   Firm F       Owner 6       Owner 6       Owner 6       Owner 6    Owner 6
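
Because every non-NA match in the example data names the same owner, the random tie-breaking of max.col is not visible in the output above. Below is a minimal sketch with hypothetical data in which the a-level matches disagree. The owner names and the three-row data frame are made up for illustration and are not taken from the question.

```r
# Hypothetical data: the a-level matches disagree, so the random pick is observable
dat <- data.frame(
  firmname      = c("Firm A", "Firm B", "Firm C"),
  ownermatch_a1 = c("Owner 1", NA, NA),
  ownermatch_a2 = c("Owner 2", NA, "Owner 4"),
  ownermatch_a3 = c("Owner 3", NA, "Owner 5"),
  ownermatch_b1 = c("Owner 9", "Owner 8", "Owner 7"),
  stringsAsFactors = FALSE
)

a_cols <- as.matrix(dat[2:4])
valid  <- !is.na(a_cols)          # TRUE where an a-level match exists

set.seed(1)                       # only to make the random pick reproducible
pick <- max.col(valid, ties.method = "random")   # one column index per row

# Matrix indexing: take the element (row, picked column) for every row
tmp <- a_cols[cbind(seq_len(nrow(dat)), pick)]

# Fall back to the b-level match where no a-level match exists
ownerfinal <- ifelse(is.na(tmp), dat$ownermatch_b1, tmp)
ownerfinal
```

Passing ties.method = "first" instead would always take the leftmost valid column, which may be preferable when reproducibility matters more than randomness.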

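If you want the first valid result, you can also use pmax. Strictly speaking, pmax returns the row-wise maximum rather than the first non-NA value; the two coincide here because all non-NA matches within a row are identical: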

do.call(pmax, c(lapply(dat[2:5],as.character), na.rm=TRUE) )
#[1] "Owner 1" "Owner 2" "Owner 3" "Owner 4" "Owner 5" "Owner 6"
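
If the matches within a row can differ, the row-wise maximum and the first non-NA value are no longer the same thing. The sketch below shows one way to get the first non-NA value in column order using only base R. The helper name coalesce_first and the sample vectors are hypothetical and chosen just to show the difference.

```r
# Hypothetical helper: element-wise "first non-NA" across several vectors
coalesce_first <- function(...) {
  Reduce(function(x, y) ifelse(is.na(x), y, x), list(...))
}

# Made-up matches that disagree within a row
a1 <- c("Owner A", NA, NA)
a2 <- c("Owner Z", NA, "Owner C")
b1 <- c("Owner M", "Owner B", "Owner D")

first_valid <- coalesce_first(a1, a2, b1)     # respects the column order
row_max     <- pmax(a1, a2, b1, na.rm = TRUE) # largest string per row

first_valid
row_max
```

Here first_valid keeps the higher-priority a-level match for the third row, while row_max returns the b-level value because it sorts later.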

Answer 1: (score: 0)

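# Return the first non-NA owner match found in columns 2 to 5 of a one-row data frame
doLookup <- function(x){
  for(i in 2:5){
    if(!is.na(x[[i]]))
      return(as.character(x[[i]]))
  }
  return(NA_character_)
}

# Loop through each record and make the assignment
# (df is assumed to hold the data frame from the question)
for(j in seq_len(nrow(df)))
   df[j,6] <- doLookup(df[j,])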
df
  firmname ownermatch_a1 ownermatch_a2 ownermatch_a3 ownermatch_b1 ownerfinal
1   Firm A       Owner 1       Owner 1       Owner 1       Owner 1    Owner 1
2   Firm B          <NA>          <NA>          <NA>       Owner 2    Owner 2
3   Firm C          <NA>          <NA>       Owner 3       Owner 3    Owner 3
4   Firm D          <NA>       Owner 4       Owner 4       Owner 4    Owner 4
5   Firm E       Owner 5       Owner 5       Owner 5       Owner 5    Owner 5
6   Firm F       Owner 6       Owner 6       Owner 6       Owner 6    Owner 6
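
The loop above always takes the first non-NA match. A row-wise sketch that follows the question's rule more literally (a random pick among the non-NA a-level matches, falling back to b1) might look like the following. The helper name pick_owner is hypothetical, and the data frame is a shortened copy of the question's example.

```r
# Shortened copy of the question's data (Firms A to D)
df <- data.frame(
  firmname      = c("Firm A", "Firm B", "Firm C", "Firm D"),
  ownermatch_a1 = c("Owner 1", NA, NA, NA),
  ownermatch_a2 = c("Owner 1", NA, NA, "Owner 4"),
  ownermatch_a3 = c("Owner 1", NA, "Owner 3", "Owner 4"),
  ownermatch_b1 = c("Owner 1", "Owner 2", "Owner 3", "Owner 4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: random choice among the valid a-level matches, else b-level
pick_owner <- function(a_matches, b_match) {
  valid <- a_matches[!is.na(a_matches)]
  if (length(valid) == 0) return(b_match)
  valid[sample.int(length(valid), 1)]   # sample.int avoids sample()'s length-1 pitfall
}

a_names <- c("ownermatch_a1", "ownermatch_a2", "ownermatch_a3")

# Apply the rule to every row; vapply guarantees one character value per row
df$ownerfinal <- vapply(seq_len(nrow(df)), function(j) {
  pick_owner(unlist(df[j, a_names], use.names = FALSE), df$ownermatch_b1[j])
}, character(1))

df
```

For large data frames the vectorized max.col approach is likely to be faster, since this version still visits each row individually.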