Question

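I want to create a vector based on conditions on several other vectors. The conditions are ordered by decreasing priority. Here is a simple example in which I create the variable 'see1', which should contain the different letters (but not NA). The priority for creating it is hierarchical: l1 > l2 > l3 > l4. For example, 'see1' can only be assigned the 'l4' value if all the other columns are NA, and if 'l1' is not NA it is automatically assigned the 'l1' value ('l1' overrides the other columns). I use nested ifelse calls to create 'see1'.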

test <- data.frame(id=c("a","b","c","d","e","f"),
               l1=c(NA,NA,"A",NA,"B", NA),
               l2=c(NA,NA,"N","N",NA,NA),
               l3=c("V",NA,NA,NA,"V","V"), 
               l4=c("H","H",NA,NA,rep("H",2)), stringsAsFactors=FALSE)
test$see1 <- ifelse(test$l1%in%c("A", "B"), test$l1,
               ifelse(test$l2%in%"N", "N", 
                  ifelse(test$l3%in%"V", "V",
                        ifelse(test$l4%in%"H","H", NA))))
test

  id   l1   l2   l3   l4 see1
1  a <NA> <NA>    V    H    V
2  b <NA> <NA> <NA>    H    H
3  c    A    N <NA> <NA>    A
4  d <NA>    N <NA> <NA>    N
5  e    B <NA>    V    H    B
6  f <NA> <NA>    V    H    V

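However, with many conditions/columns this task becomes cumbersome. I have looked through similar 'nested ifelse' questions but did not come across this particular problem.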

Answer 1

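You could try max.col with ties.method='first' on the 'l\\d' columns to create the column index: applied to the logical matrix of non-NA entries, it returns for each row the position of the first non-NA column. cbind that with 1:nrow(test) to build the row/column index, and use it to extract the elements from the subset of the 'test' dataset.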

nm <- grep('^l\\d+', names(test))
test[nm][cbind(1:nrow(test), max.col(!is.na(test[nm]), 'first'))]
#[1] "V" "H" "A" "N" "B" "V"
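
To make the one-liner above easier to follow, here is the same computation split into steps. This is only a sketch; the intermediate names (not_na, first_col, idx) are introduced here for illustration and are not part of the original answer.

```r
# Same example data as in the question
test <- data.frame(id = c("a", "b", "c", "d", "e", "f"),
                   l1 = c(NA, NA, "A", NA, "B", NA),
                   l2 = c(NA, NA, "N", "N", NA, NA),
                   l3 = c("V", NA, NA, NA, "V", "V"),
                   l4 = c("H", "H", NA, NA, "H", "H"),
                   stringsAsFactors = FALSE)

nm <- grep("^l\\d+", names(test))        # positions of the l1..l4 columns
not_na <- !is.na(test[nm])               # logical matrix: TRUE where a value exists

# TRUE counts as 1 and FALSE as 0, so the row maximum sits at a non-NA entry;
# ties.method = "first" picks the leftmost one, i.e. the highest priority.
first_col <- max.col(not_na, ties.method = "first")

idx <- cbind(seq_len(nrow(test)), first_col)   # (row, column) pairs
see1 <- as.matrix(test[nm])[idx]               # pick one cell per row
see1
```

If a row has no non-NA value at all, every entry of not_na in that row is FALSE, the tie is resolved to the first column, and the extracted cell is NA, which is probably the desired result in that case.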

Or some options using apply:

 apply(test[nm], 1, function(x) x[Position(function(y) !is.na(y), x)])
 #[1] "V" "H" "A" "N" "B" "V"

  apply(test[nm], 1, function(x) x[!is.na(x)][1])
  #[1] "V" "H" "A" "N" "B" "V"
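
A quick check of how the two apply variants behave when a row has no value at all. The extra row 'g' below is hypothetical and not part of the question's data; it is added only to illustrate that, as far as I can tell, both variants return NA for such a row.

```r
# Question's data plus one hypothetical all-NA row ("g")
test <- data.frame(id = c("a", "b", "c", "d", "e", "f", "g"),
                   l1 = c(NA, NA, "A", NA, "B", NA, NA),
                   l2 = c(NA, NA, "N", "N", NA, NA, NA),
                   l3 = c("V", NA, NA, NA, "V", "V", NA),
                   l4 = c("H", "H", NA, NA, "H", "H", NA),
                   stringsAsFactors = FALSE)
nm <- grep("^l\\d+", names(test))

# Indexing past the end of an empty vector gives NA
by_subset <- unname(apply(test[nm], 1, function(x) x[!is.na(x)][1]))

# Position() returns NA_integer_ when nothing matches, and x[NA_integer_] is NA
by_position <- unname(apply(test[nm], 1,
                            function(x) x[Position(function(y) !is.na(y), x)]))

by_subset
by_position
```

Note that apply first converts the selected columns to a matrix, so this works cleanly here because all four columns are character.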

Answer 2

Here is a coalesce solution:

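First, reorder the columns in test (in my example the columns do not need reordering, but in other cases it may matter). Note that this select() call also drops the 'id' column, which is why it is missing from the output below.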

require(dplyr)
require(magrittr) # for piping
test %<>% select(l1,l2, l3, l4)

Now use the coalesce function:

coalesce2 <- function(...){
 Reduce(function(x,y) {
    i<-which(is.na(x))
    x[i]<-y[i]
    x},
    list(...))
}

test$see1 <- coalesce2(test$l1,test$l2, test$l3, test$l4)
test
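
Since the question is about many columns, typing every column into the call may itself become tedious. One possible way around that, sketched here under the assumption that the priority order is kept in a character vector (the name priority is introduced for illustration), is to build the call with do.call. dplyr also ships its own coalesce function, which should give the same result when dplyr is installed.

```r
# Same example data as in the question
test <- data.frame(id = c("a", "b", "c", "d", "e", "f"),
                   l1 = c(NA, NA, "A", NA, "B", NA),
                   l2 = c(NA, NA, "N", "N", NA, NA),
                   l3 = c("V", NA, NA, NA, "V", "V"),
                   l4 = c("H", "H", NA, NA, "H", "H"),
                   stringsAsFactors = FALSE)

# coalesce2 as defined in this answer: fill NAs in x from y, left to right
coalesce2 <- function(...) {
  Reduce(function(x, y) {
    i <- which(is.na(x))
    x[i] <- y[i]
    x
  }, list(...))
}

# Hypothetical helper vector: column names in decreasing priority
priority <- c("l1", "l2", "l3", "l4")

# Pass the columns as arguments without typing each one
see1 <- do.call(coalesce2, unname(as.list(test[priority])))
see1

# Optional: the same idea with dplyr's built-in coalesce, if available
if (requireNamespace("dplyr", quietly = TRUE)) {
  see1_dplyr <- do.call(dplyr::coalesce, unname(as.list(test[priority])))
}
```

With this layout, changing the priority only means reordering the priority vector; the columns of test can stay where they are.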

Or (again) with the help of the magrittr package:

require(magrittr)
test$see1 <- test%$% coalesce2(l1,l2, l3, l4)
test

>    l1   l2   l3   l4 see1
>1 <NA> <NA>    V    H    V
>2 <NA> <NA> <NA>    H    H
>3    A    N <NA> <NA>    A
>4 <NA>    N <NA> <NA>    N
>5    B <NA>    V    H    B
>6 <NA> <NA>    V    H    V
