Column names of the highest and second-highest values

Time: 2017-02-27 17:51:57

Tags: r

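For each row, I am trying to identify the column names with the highest and second-highest value, provided they hold a value not equal to zero in that row. Dataset: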

DT=data.frame(Row=c(1,2,3,4,5),Price=c(2.1,2.1,2.2,2.3,2.5),
      '2.0'= c(100,300,700,400,0),
      '2.1'= c(400,200,100,500,0),
      '2.2'= c(600,700,200,100,-200),
      '2.3'= c(300,0,-300,100,100),
      '2.4'= c(400,0,0,500,600),
      '2.5'= c(0,200,0,800,-100),check.names=FALSE)

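The goal is a column Highest holding the highest column name that has any non-zero value in the row, and a column Second holding the second-highest column name that has any non-zero value in the row: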

DT=data.frame(Row=c(1,2,3,4,5),Price=c(2.1,2.1,2.2,2.3,2.5),
      '2.0'= c(100,300,700,400,0),
      '2.1'= c(400,200,100,500,0),
      '2.2'= c(600,700,200,100,-200),
      '2.3'= c(300,0,-300,100,100),
      '2.4'= c(400,0,0,500,600),
      '2.5'= c(0,200,0,800,-100),check.names=FALSE,
      Highest=c(2.4,2.5,2.3,2.5,2.5),Second=c(2.3,2.2,2.3,2.4,2.4))

The code for Highest is:

# Use only the value columns (3:8); including Price would turn its name into NA
DT$Highest <- apply(DT[3:8], 1, function(x) max(as.numeric(names(which(x != 0)))))

Cheers

1 answer:

Answer 0: (score: 2)

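# Name of the column holding each row's largest value (columns 3:8 are '2.0' to '2.5').
# Ties go to the leftmost column; the x != 0 test gives NA if that value is zero.
DT$highest = colnames(DT)[2+apply(DT[,3:8], 1, function(x)
                   which(x != 0 & x == sort(x, decreasing = TRUE)[1])[1])]
#[1] "2.2" "2.2" "2.0" "2.5" "2.4"

# Same idea for the second-largest value in each row
DT$second_highest = colnames(DT)[2+apply(DT[,3:8], 1, function(x)
                   which(x != 0 & x == sort(x, decreasing = TRUE)[2])[1])]
#[1] "2.1" "2.0" "2.2" "2.1" "2.3"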
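
The code above ranks columns by cell value. The target columns in the question seem to want something else: the largest and second-largest column names whose cell is non-zero. If that reading is the intended one (an assumption on my part), a minimal sketch could look like this. nth_nonzero_name is a hypothetical helper name, not a base R function.

```r
# Sketch under an assumed reading: rank the column NAMES, not the cell values.
DT <- data.frame(Row = c(1, 2, 3, 4, 5), Price = c(2.1, 2.1, 2.2, 2.3, 2.5),
                 '2.0' = c(100, 300, 700, 400, 0),
                 '2.1' = c(400, 200, 100, 500, 0),
                 '2.2' = c(600, 700, 200, 100, -200),
                 '2.3' = c(300, 0, -300, 100, 100),
                 '2.4' = c(400, 0, 0, 500, 600),
                 '2.5' = c(0, 200, 0, 800, -100), check.names = FALSE)

# Hypothetical helper: n-th largest numeric column name with a non-zero cell.
nth_nonzero_name <- function(x, n) {
  # Names of the non-zero cells as numbers, largest first
  nm <- sort(as.numeric(names(x)[x != 0]), decreasing = TRUE)
  # Indexing past the end gives NA when fewer than n cells are non-zero
  nm[n]
}

vals <- DT[, 3:8]  # only the '2.0' to '2.5' columns
DT$Highest <- apply(vals, 1, nth_nonzero_name, n = 1)
DT$Second  <- apply(vals, 1, nth_nonzero_name, n = 2)

DT$Highest
# 2.4 2.5 2.3 2.5 2.5
DT$Second
# 2.3 2.2 2.2 2.4 2.4
```

Highest matches the target in the question. Second gives 2.2 for row 3 where the target lists 2.3, so either the target has a typo there or this reading is not quite what was meant.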