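I am trying to identify, for each row, the names of the highest and second-highest columns that hold a non-zero value in that row. The data set: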
DT=data.frame(Row=c(1,2,3,4,5),Price=c(2.1,2.1,2.2,2.3,2.5),
'2.0'= c(100,300,700,400,0),
'2.1'= c(400,200,100,500,0),
'2.2'= c(600,700,200,100,-200),
'2.3'= c(300,0,-300,100,100),
'2.4'= c(400,0,0,500,600),
'2.5'= c(0,200,0,800,-100),check.names=FALSE)
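The goal is a Highest column giving the highest column name that has any non-zero value in that row, and a Second column giving the second-highest column name that has any non-zero value: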
DT=data.frame(Row=c(1,2,3,4,5),Price=c(2.1,2.1,2.2,2.3,2.5),
'2.0'= c(100,300,700,400,0),
'2.1'= c(400,200,100,500,0),
'2.2'= c(600,700,200,100,-200),
'2.3'= c(300,0,-300,100,100),
'2.4'= c(400,0,0,500,600),
'2.5'= c(0,200,0,800,-100),check.names=FALSE,
Highest=c(2.4,2.5,2.3,2.5,2.5),Second=c(2.3,2.2,2.3,2.4,2.4))
The code for Highest is:
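# Drop both Row and Price: keeping Price would put the non-numeric name "Price" into as.numeric() and yield NA
DT$Highest <- apply(DT[-(1:2)], 1, function(x) max(as.numeric(names(which(x != 0)))))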
Cheers
Answer 0 (score: 2)
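# Name of the column holding the largest non-zero cell value in each row
# (columns 3:8 are the value columns, hence the offset of 2 into colnames(DT))
DT$highest = colnames(DT)[2+apply(DT[,3:8], 1, function(x)
which(x != 0 & x == sort(x, decreasing = TRUE)[1])[1])]
#[1] "2.2" "2.2" "2.0" "2.5" "2.4"
# Same idea for the second-largest cell value; on ties the first matching column is taken
DT$second_highest = colnames(DT)[2+apply(DT[,3:8], 1, function(x)
which(x != 0 & x == sort(x, decreasing = TRUE)[2])[1])]
#[1] "2.1" "2.0" "2.2" "2.1" "2.3"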
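The code above ranks columns by cell value. If the question is instead read as asking for the largest and second-largest column labels whose cell is non-zero (which is what the Highest code in the question does), a minimal sketch could look like the following. The helper name kth_nonzero_label and the intermediate vals are made up for illustration and are not from the original post. Under this reading row 3 gives 2.2 for Second, whereas the expected output in the question lists 2.3.

```r
# Rebuild the question's data set; value columns are named by price level.
DT <- data.frame(Row = 1:5, Price = c(2.1, 2.1, 2.2, 2.3, 2.5),
                 "2.0" = c(100, 300, 700, 400, 0),
                 "2.1" = c(400, 200, 100, 500, 0),
                 "2.2" = c(600, 700, 200, 100, -200),
                 "2.3" = c(300, 0, -300, 100, 100),
                 "2.4" = c(400, 0, 0, 500, 600),
                 "2.5" = c(0, 200, 0, 800, -100),
                 check.names = FALSE)

# Hypothetical helper: k-th largest column label among the non-zero cells of
# one row; returns NA when the row has fewer than k non-zero cells.
kth_nonzero_label <- function(x, k) {
  labels <- sort(as.numeric(names(x)[x != 0]), decreasing = TRUE)
  if (length(labels) >= k) labels[k] else NA_real_
}

# Keep only the value columns (drop Row and Price) before scanning rows.
vals <- as.matrix(DT[, -(1:2)])
DT$Highest <- apply(vals, 1, kth_nonzero_label, k = 1)
DT$Second  <- apply(vals, 1, kth_nonzero_label, k = 2)
```

Working on a matrix of just the value columns avoids having to offset column indices, and the NA fallback covers rows with too few non-zero cells, a case that does not occur in the sample data.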