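Suppose I have a data frame like the one below. For each row, I want to find the highest value and put the name of the column it came from (not the value itself) into a new vector. How can I do this?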
df <- data.frame(matrix(rnorm(50, 20), 5))
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 18.49755 18.98823 18.53194 18.86478 20.74333 18.04460 21.08717 21.75072 18.05813 19.08402
2 20.44626 20.07205 19.36755 17.14943 18.58396 20.76463 20.23776 18.90171 18.99182 20.51338
3 20.27142 18.74448 21.42953 20.13568 20.40065 22.26788 19.30967 20.51772 19.20067 19.75371
4 20.61600 21.27852 18.54137 20.84269 20.27767 20.70583 21.33051 20.03136 20.60405 21.24672
5 19.64165 21.20197 20.06732 19.59529 20.48761 19.83571 19.80155 21.02669 20.77574 21.21862
I tried:
results <- apply(df, 1, max)
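This gives me the highest value in each row, but what I actually want in the result vector is the column name associated with that value, not the value itself.
So instead of a vector of the 5 row-wise maxima, I would get a vector of the column names that "won", for example: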
result <- c("X1", "X3", "X2", "X1", "X9")
Thanks.
Answer 0 (score: 4)
Use which.max:
names(df)[apply(df, 1, which.max)]
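As a vectorized alternative to the apply call above, base R's max.col() returns the column index of each row's maximum directly. The following is a minimal sketch, assuming an all-numeric data frame like the one in the question; the seed and the names winner_idx and winners are chosen here only for illustration. ties.method = "first" is set explicitly because the default is "random". Note that max.col() treats values within a small relative tolerance as tied, so it may differ from which.max when two entries in a row are nearly equal.

```r
set.seed(123)  # seed chosen only so the sample is reproducible
df <- data.frame(matrix(rnorm(50, 20), 5))

# max.col() coerces df to a matrix and returns, for each row,
# the index of the column holding the maximum
winner_idx <- max.col(df, ties.method = "first")

# Map the indices back to column names
winners <- names(df)[winner_idx]
winners
# [1] "X4" "X6" "X1" "X9" "X6"
```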
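Answer 1 (score: 2)
You can add a step to your apply function that returns the name associated with the max value.
Note that I used set.seed() when generating the random sample.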
set.seed(123)
df <- data.frame(matrix(rnorm(50, 20), 5))
apply(df, 1, function(x) {
names(x)[x == max(x)]
})
# [1] "X4" "X6" "X1" "X9" "X6"
df
# X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# 1 19.43952 21.71506 21.22408 21.78691 18.93218 18.31331 20.42646 20.68864 19.30529 18.87689
# 2 19.76982 20.46092 20.35981 20.49785 19.78203 20.83779 19.70493 20.55392 19.79208 19.59712
# 3 21.55871 18.73494 20.40077 18.03338 18.97400 20.15337 20.89513 19.93809 18.73460 19.53334
# 4 20.07051 19.31315 20.11068 20.70136 19.27111 18.86186 20.87813 19.69404 22.16896 20.77997
# 5 20.12929 19.55434 19.44416 19.52721 19.37496 21.25381 20.82158 19.61953 21.20796 19.91663
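One point worth checking with the x == max(x) comparison above is what happens when a row has tied maxima: every tied name is returned, so apply may give back a list rather than a character vector. The small data frame below is made up purely to illustrate this (the names ties, all_winners and first_winner are hypothetical), and it compares the result with which.max, which keeps only the first maximum.

```r
# Hypothetical data: row 1 has a unique maximum, row 2 has a tie between X1 and X2
ties <- data.frame(X1 = c(1, 5), X2 = c(3, 5), X3 = c(2, 2))

# x == max(x) keeps every tied column, so the per-row results differ in length
# and apply() returns a list
all_winners <- apply(ties, 1, function(x) names(x)[x == max(x)])
is.list(all_winners)
# [1] TRUE

# which.max() keeps only the first maximum, so the result stays a plain vector
first_winner <- names(ties)[apply(ties, 1, which.max)]
first_winner
# [1] "X2" "X1"
```

With continuous random data like rnorm() exact ties are very unlikely, so both approaches agree on the sample above; the difference matters mainly for integer or rounded data.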
Just for kicks, an overkill dplyr & reshape2 variant:
library(dplyr)
library(reshape2)
df$row <- row.names(df)
melt(df) %>%
group_by(row) %>%
arrange(desc(value)) %>%
slice(1)
# Source: local data frame [5 x 3]
# Groups: row [5]
#
# row variable value
# (chr) (fctr) (dbl)
# 1 1 X4 21.78691
# 2 2 X6 20.83779
# 3 3 X1 21.55871
# 4 4 X9 22.16896
# 5 5 X6 21.25381
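reshape2 has since been retired in favor of tidyr, so the same long-format idea can also be sketched with tidyr::pivot_longer() and dplyr::slice_max(). This is only a sketch, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later are installed; the column names row, variable and value are kept from the example above, and winners is a name introduced here for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(123)
df <- data.frame(matrix(rnorm(50, 20), 5))

winners <- df %>%
  mutate(row = row_number()) %>%                      # keep track of the original row
  pivot_longer(-row, names_to = "variable",
               values_to = "value") %>%               # one line per (row, column) pair
  group_by(row) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%      # keep the largest value per row
  ungroup()

winners$variable
# [1] "X4" "X6" "X1" "X9" "X6"
```

Setting with_ties = FALSE guarantees exactly one line per original row; leaving it at the default would return every tied column.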
Answer 2 (score: 0)
This is how I did it; let me know if it works for you.
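df <- data.frame(matrix(rnorm(50, 20), 5))
# Preallocate a character vector with one slot per row
my_list <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  # Flatten the one-row data frame into a named numeric vector
  x <- unlist(df[i, ])
  # Look the name up by position instead of pattern-matching the value with grep
  my_list[i] <- names(x)[which.max(x)]
}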
> my_list
[1] "X5" "X7" "X7" "X6" "X8"
> df
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 19.22859 19.78252 20.08969 19.60546 21.09189 18.27778 18.53504 19.38758 18.14770 20.64938
2 20.23044 21.90423 19.91845 21.06613 21.82551 21.08873 22.05754 19.81582 20.74686 19.38851
3 19.83008 19.58174 21.42340 19.66734 20.64790 19.72775 22.35714 19.23881 21.81957 19.44846
4 20.96194 20.17202 20.82502 19.11394 20.18380 21.64440 19.46687 19.73009 18.89267 20.89549
5 19.83232 20.40958 19.94605 19.49419 19.80325 20.39628 19.59710 21.84272 20.02212 21.22459