Question

I have a data frame df1 and a list l1, as shown below:

df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0), c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))
l1 <- list(x1 = c(2, 4), x2 = c(3), x3 = c(2))

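Here l1 holds, for each row of df1, the column indices to ignore. For each row, I want the indices of the top 2 (possibly more) elements after excluding the indices listed in l1 for that row. The actual data has many more rows and columns. The expected output is: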

[1,]    1 3
[2,]    2 4
[3,]    1 3

where df1 is:

      c1   c2  c3  c4
1    4.2  2.3 1.2 2.2
2    1.2  1.8 3.2 1.9
3    3.0 12.0 2.0 0.9

It would also help if the indices could be ordered by the values at those positions, largest first. The expected output would then be:

 [1,]    1 3
 [2,]    4 2
 [3,]    1 3

Answer 1

We can use rank:

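lapply(seq_len(nrow(df1)), function(i) {
  x1 <- unlist(df1[i, ])               # row i as a named numeric vector
  i2 <- l1[[i]]                        # column indices to ignore for this row
  i3 <- seq_along(x1) %in% i2          # TRUE at the ignored positions
  which(rank(-x1 * NA^i3) %in% 1:2)    # NA^TRUE is NA, NA^FALSE is 1
})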
#[[1]]
#[1] 1 3

#[[2]]
#[1] 2 4

#[[3]]
#[1] 1 3
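
The masking step in this answer relies on R's rule that `NA^0` is 1 while `NA^1` stays `NA`. A minimal sketch on the second row of df1 (the intermediate names `mask`, `masked`, and `r` are only illustrative, and `na.last = "keep"` is passed explicitly so the behaviour does not depend on defaults):

```r
x1 <- c(c1 = 1.2, c2 = 1.8, c3 = 3.2, c4 = 1.9)  # second row of df1
i3 <- seq_along(x1) %in% 3                       # TRUE where the column is excluded
mask <- NA^i3                                    # 1 for kept columns, NA for excluded ones
masked <- -x1 * mask                             # negate so the largest value gets rank 1
r <- rank(masked, na.last = "keep")              # excluded entries keep an NA rank
which(r %in% 1:2)                                # positions 2 and 4
```

Because the excluded entries never receive a rank, they cannot match `1:2`, however large their original values were.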

Update

If we need the indices ordered by value, we can use order:

lapply(seq_len(nrow(df1)), function(i) {
  x1 <- unlist(df1[i, ])
  i2 <- l1[[i]]
  i3 <- seq_along(x1) %in% i2
  i4 <- which(rank(-x1 * NA^i3) %in% 1:2)
  i4[order(-x1[i4])]
})
#[[1]]
#[1] 1 3

#[[2]]
#[1] 4 2

#[[3]]
#[1] 1 3
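
Since the question mentions that more than two elements may be needed, the same idea can be wrapped in a function with a configurable count that returns a matrix. This is only a sketch: `top_n_idx` and its parameter `n` are hypothetical names, and it swaps `rank` for `order` with `na.last = NA` to drop the excluded entries.

```r
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0),
                  c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))
l1 <- list(x1 = c(2, 4), x2 = c(3), x3 = c(2))

# Hypothetical helper: top-n column indices per row, largest value first
top_n_idx <- function(df, excl, n = 2) {
  m <- as.matrix(df)
  out <- vapply(seq_len(nrow(m)), function(i) {
    x <- m[i, ]
    x[excl[[i]]] <- NA                                      # mask the excluded columns
    order(x, decreasing = TRUE, na.last = NA)[seq_len(n)]   # NAs are dropped before ranking
  }, integer(n))
  matrix(out, ncol = n, byrow = TRUE)                       # one row of indices per data row
}

top_n_idx(df1, l1)
```

If a row has fewer than `n` columns left after exclusion, the remaining slots are filled with NA.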

Answer 2

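I understand the problem as follows: for each row i of df1, exclude the elements whose indices are in l1[[i]], then return the indices of the two largest remaining elements.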

highest.two <- function(x) {
  x <- unlist(x)  # accept a one-row data.frame as well as a numeric vector
  first.highest_position <- which.max(x)
  x[first.highest_position] <- -Inf  # mask the maximum so tied values are handled correctly
  second.highest_position <- which.max(x)
  unname(c(first.highest_position, second.highest_position))
}

ret <- matrix(NA_integer_, nrow = nrow(df1), ncol = 2)
for (i in seq_len(nrow(df1))) {
  tmp <- unlist(df1[i, ])
  tmp[l1[[i]]] <- -Inf  # excluded columns can never be among the largest
  ret[i, ] <- highest.two(tmp)  # to order the indices by position instead, use sort(highest.two(tmp))
}
ret

Answer 3

This also uses rank but returns a matrix. Because t() converts the data.frame into a matrix, the syntax gets a bit ugly.

df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0), c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))
l1 <- list(x1 = c(2, 4), x2 = c(3), x3 = c(2))


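indexOrderSub <- function(df, excl, top = 2) {
    z <- seq_along(df)                            # all column positions
    sel <- !(z %in% excl)                         # TRUE for the columns that are kept
    rz <- z[sel]                                  # positions of the kept columns
    rz2 <- tail(rz[order(rank(df)[sel])], top)    # positions of the `top` largest kept values
    rz2[order(rz2)]                               # return them sorted by position
}

t(mapply(indexOrderSub, as.data.frame(t(df1)), l1))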
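
The transposition is what lets mapply pair each row of df1 with its entry in l1: mapply walks over the columns of a data frame, so the rows first have to become columns. A small sketch of that intermediate step (`rows_as_cols` is only an illustrative name):

```r
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0),
                  c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))

rows_as_cols <- as.data.frame(t(df1))  # each column now holds one row of df1
names(rows_as_cols)                    # "1" "2" "3", the original row names
rows_as_cols[["2"]]                    # 1.2 1.8 3.2 1.9, the second row of df1
```

Each of these columns is passed to indexOrderSub as a plain numeric vector together with the matching element of l1.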
