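I am trying to find, for each row, the largest and second-largest values across the columns, along with the names of the two columns that hold them. I am stuck on getting the name of the column with the second-largest value.
I tried writing an lapply function that removes the first maximum from consideration, but that reduced the column-name count. Any suggestions?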
temp<-data.frame(c(1,2,3,4),c(1,2,3,1),c(4,5,1,2),c(1,6,5,4),c(2,2,2,2))
colnames(temp)<-c("c1","c2","c3","c4","c5")
temp$MaxOrders<-as.numeric(apply(temp[,c(-1)],1,function(x){x[which.max(x)]}))
temp$secondMaxOrders<-as.numeric(apply(temp[,c(2,3,4,5)],1,function(x){x[order(x)[2]]}))
temp$MaxColName<-colnames(temp)[c(-1)][max.col(temp[,c(-1)],ties.method="first")]
temp
  c1 c2 c3 c4 c5 MaxOrders secondMaxOrders MaxColName
1  1  1  4  1  2         4               1         c3
2  2  2  5  6  2         6               5         c4
3  3  3  1  5  2         5               3         c4
4  4  1  2  4  2         4               2         c4
Goal: also find the name of the column holding the second-highest value
  c1 c2 c3 c4 c5 MaxOrders secondMaxOrders MaxColName secondMaxColumnName
1  1  1  4  1  2         4               2         c3                  c5
2  2  2  5  6  2         6               5         c4                  c3
3  3  3  1  5  2         5               3         c4                  c2
4  4  1  2  4  2         4               2         c4                  c3
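Answer 0 (score: 2)
We can do this with a single apply call by finding the two largest values in each row and returning their column names. Note that this version considers every column of temp, including c1, and that the values come back as character strings because they are combined with the column names in one vector.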
temp[c("MaxOrders", "secondMaxOrders", "MaxColName", "secondMaxColumnName")] <-
  t(apply(temp, 1, function(x) {
    inds <- order(x, decreasing = TRUE)[1:2]
    c(x[inds], names(temp)[inds])
  }))
temp
# c1 c2 c3 c4 c5 MaxOrders secondMaxOrders MaxColName secondMaxColumnName
#1 1 1 4 1 2 4 2 c3 c5
#2 2 2 5 6 2 6 5 c4 c3
#3 3 3 1 5 2 5 3 c4 c1
#4 4 1 2 4 2 4 4 c1 c4
Or, if you want to drop the maximum value entirely and treat the largest of the remaining values as the second maximum:
t(apply(temp, 1, function(x) {
  inds <- match(unique(sort(x, decreasing = TRUE))[1:2], x)
  c(x[inds], names(temp)[inds])
}))
# [,1] [,2] [,3] [,4]
#[1,] "4" "2" "c3" "c5"
#[2,] "6" "5" "c4" "c3"
#[3,] "5" "3" "c4" "c1"
#[4,] "4" "2" "c1" "c3"
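If only c2 to c5 should be searched, as in the question, and the value columns should stay numeric, the same order() idea can be sketched as follows. The helper name top_two and its cols argument are hypothetical, introduced here for illustration only; ties go to the left-most column.

```r
# Sample data from the question
temp <- data.frame(c1 = c(1, 2, 3, 4), c2 = c(1, 2, 3, 1), c3 = c(4, 5, 1, 2),
                   c4 = c(1, 6, 5, 4), c5 = c(2, 2, 2, 2))

# Hypothetical helper: two largest values per row and their column names,
# searching only the columns named in `cols`
top_two <- function(df, cols) {
  m <- as.matrix(df[cols])
  # order() leaves ties in their original order, so the left-most column wins
  inds <- t(apply(m, 1, function(x) order(x, decreasing = TRUE)[1:2]))
  rows <- seq_len(nrow(m))
  data.frame(
    MaxOrders           = m[cbind(rows, inds[, 1])],  # stays numeric
    secondMaxOrders     = m[cbind(rows, inds[, 2])],
    MaxColName          = cols[inds[, 1]],
    secondMaxColumnName = cols[inds[, 2]],
    stringsAsFactors    = FALSE
  )
}

res <- cbind(temp, top_two(temp, c("c2", "c3", "c4", "c5")))
res
```

Building the values and the names as separate data-frame columns avoids the coercion to character that happens when both are returned in a single vector.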
Answer 1 (score: 2)
temp<-data.frame(c(1,2,3,1),c(4,5,1,2),c(1,6,5,4),c(2,2,2,2))
colnames(temp)<-c("c2","c3","c4","c5")
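# Column index of each row's maximum. Note that max.col() breaks ties at random
# by default; pass ties.method = "first" for reproducible results.
m1 = max.col(temp)
# Replace every occurrence of the row maximum with -Inf, then search again
m2 = max.col(t(sapply(seq_along(m1), function(i)
  replace(temp[i,], temp[i,] == temp[i, m1[i]], -Inf))))
# Look up the values at those (row, column) positions
max1 = temp[cbind(1:NROW(temp), m1)]
max2 = temp[cbind(1:NROW(temp), m2)]
data.frame(m1 = colnames(temp)[m1],
           m2 = colnames(temp)[m2],
           max1,
           max2)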
# m1 m2 max1 max2
#1 c3 c5 4 2
#2 c4 c3 6 5
#3 c4 c2 5 3
#4 c4 c5 4 2
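The masking approach can also be sketched on a plain numeric matrix with ties.method = "first", so that tied columns resolve the same way on every run. This is an illustrative variant under those assumptions, not the code from the answer; the names m, masked and res are introduced here.

```r
# Columns c2 to c5 from the question
temp <- data.frame(c2 = c(1, 2, 3, 1), c3 = c(4, 5, 1, 2),
                   c4 = c(1, 6, 5, 4), c5 = c(2, 2, 2, 2))
m <- as.matrix(temp)
rows <- seq_len(nrow(m))

# Column index of each row's maximum; the left-most column wins ties
m1 <- max.col(m, ties.method = "first")
max1 <- m[cbind(rows, m1)]

# Mask every occurrence of the row maximum, then search again.
# max1 has one entry per row, so it recycles correctly down each column.
masked <- m
masked[m == max1] <- -Inf
m2 <- max.col(masked, ties.method = "first")
max2 <- m[cbind(rows, m2)]

res <- data.frame(m1 = colnames(m)[m1], m2 = colnames(m)[m2],
                  max1, max2, stringsAsFactors = FALSE)
res
```

With this tie rule, the second-maximum column for row 4 is c3 rather than c5, since c3 and c5 both hold the value 2.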
Answer 2 (score: 0)
You can use a key vector of column names and combine the *Orders and *ColName values with c:
key <- setNames(names(temp[1:5]), 1:5)
nms <- c("MaxOrders", "secondMaxOrders", "MaxColName", "secondMaxColumnName")
d <- t(sapply(seq(nrow(temp)), function(x) {
  # unlist() so that order() receives a plain numeric vector, not a data frame
  o <- order(-unlist(temp[x, 2:5]))[1:2]
  return(setNames(c(temp[x, o + 1], key[o + 1]), nms))
}))
This should give you the desired result:
cbind(temp, d)
# c1 c2 c3 c4 c5 MaxOrders secondMaxOrders MaxColName secondMaxColumnName
# 1 1 1 4 1 2 4 2 c3 c5
# 2 2 2 5 6 2 6 5 c4 c3
# 3 3 3 1 5 2 5 3 c4 c2
# 4 4 1 2 4 2 4 2 c4 c3
Data
temp <- structure(list(c1 = c(1, 2, 3, 4), c2 = c(1, 2, 3, 1), c3 = c(4, 5, 1, 2),
c4 = c(1, 6, 5, 4), c5 = c(2, 2, 2, 2)), class = "data.frame",
row.names = c(NA, -4L))