Pulling column names by their associated values and adding only the column names to a new vector

Time: 2017-02-22 23:58:04

Tags: r

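Say I have a data frame like the one below. For each row I want to find the highest value and put the name of the column that holds it (not the value itself) into a new vector. How can I do that?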

df <- data.frame(matrix(rnorm(50, 20), 5))

  X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
1 18.49755 18.98823 18.53194 18.86478 20.74333 18.04460 21.08717 21.75072 18.05813 19.08402
2 20.44626 20.07205 19.36755 17.14943 18.58396 20.76463 20.23776 18.90171 18.99182 20.51338
3 20.27142 18.74448 21.42953 20.13568 20.40065 22.26788 19.30967 20.51772 19.20067 19.75371
4 20.61600 21.27852 18.54137 20.84269 20.27767 20.70583 21.33051 20.03136 20.60405 21.24672
5 19.64165 21.20197 20.06732 19.59529 20.48761 19.83571 19.80155 21.02669 20.77574 21.21862

I tried

results <- apply(df, 1, max)

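This gives me the highest value in each row, but what I actually want in the result vector is the name of the column holding that value, not the value itself.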

So instead of a vector of the 5 row-wise highest values, I would get a vector of the column names that "won", such as

result <- c("X1", "X3", "X2", "X1", "X9")

Thanks.

3 Answers:

Answer 0: (score: 4)

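Use which.max, which returns the position of the (first) maximum in each row; that position then indexes the column names: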

names(df)[apply(df, 1, which.max)]
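To make the behaviour easy to check by eye, here is a minimal sketch on a small hand-made data frame (the values are made up for illustration and are not from the question). It also shows base R's max.col() as a possible vectorized alternative; that function is not part of the answer above, and ties.method = "first" is set on the assumption that you want the same tie-breaking as which.max().

```r
# Small hand-made data frame so the expected winners are easy to verify
df <- data.frame(X1 = c(3, 1, 5), X2 = c(9, 4, 5), X3 = c(2, 8, 1))

# which.max() returns the index of the first maximum in each row
by_apply <- names(df)[apply(df, 1, which.max)]

# max.col() finds the winning column per row in one vectorized call;
# ties.method = "first" mirrors which.max(), because the default
# tie-breaking in max.col() is random
by_maxcol <- names(df)[max.col(df, ties.method = "first")]

print(by_apply)
print(by_maxcol)
```

Row 3 has a tie between X1 and X2, and both approaches pick X1 here because they keep the first maximum.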

Answer 1: (score: 2)

You can add a step inside the function passed to apply so that it returns the column name associated with the max.

Note that I used set.seed() when generating the random sample, so the result is reproducible.

set.seed(123)

df <- data.frame(matrix(rnorm(50, 20), 5))

apply(df, 1, function(x) {  
    names(x)[x == max(x)]  
})

# [1] "X4" "X6" "X1" "X9" "X6"
df

#         X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
# 1 19.43952 21.71506 21.22408 21.78691 18.93218 18.31331 20.42646 20.68864 19.30529 18.87689
# 2 19.76982 20.46092 20.35981 20.49785 19.78203 20.83779 19.70493 20.55392 19.79208 19.59712
# 3 21.55871 18.73494 20.40077 18.03338 18.97400 20.15337 20.89513 19.93809 18.73460 19.53334
# 4 20.07051 19.31315 20.11068 20.70136 19.27111 18.86186 20.87813 19.69404 22.16896 20.77997
# 5 20.12929 19.55434 19.44416 19.52721 19.37496 21.25381 20.82158 19.61953 21.20796 19.91663
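One caveat worth sketching: the comparison x == max(x) keeps every column that ties for the maximum, so apply() can return a list instead of a character vector when a row has ties. The data frame below is hypothetical and built only to force a tie; swapping in which.max() is one possible way to keep exactly one name per row.

```r
# Hypothetical data with a tie in row 2 (X1 and X2 are both 7)
df_ties <- data.frame(X1 = c(1, 7), X2 = c(4, 7), X3 = c(2, 3))

# x == max(x) keeps all tied columns, so the results differ in length
# and apply() falls back to returning a list
all_winners <- apply(df_ties, 1, function(x) names(x)[x == max(x)])

# which.max() keeps only the first maximum, so the result stays a vector
first_winner <- apply(df_ties, 1, function(x) names(x)[which.max(x)])

str(all_winners)
print(first_winner)
```

With continuous random data such as rnorm() output, exact ties are very unlikely, so this mostly matters for integer or rounded values.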

Just for kicks, an overkill dplyr & reshape2 variant

library(dplyr)
library(reshape2)
df$row <- row.names(df)

melt(df) %>% 
    group_by(row) %>%
    arrange(desc(value)) %>%
    slice(1) 

# Source: local data frame [5 x 3]
# Groups: row [5]
# 
# row variable    value
# (chr)   (fctr)    (dbl)
# 1     1       X4 21.78691
# 2     2       X6 20.83779
# 3     3       X1 21.55871
# 4     4       X9 22.16896
# 5     5       X6 21.25381
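If you want the same row / variable / value table without extra packages, a base R sketch along these lines might do. It is not part of the original answer; the name winners is made up for illustration, and max.col() with ties.method = "first" is an assumption about how ties should be handled.

```r
set.seed(123)
df <- data.frame(matrix(rnorm(50, 20), 5))

m <- as.matrix(df)

# Column index of the (first) maximum in each row
idx <- max.col(m, ties.method = "first")

# One line per row: row number, winning column name, winning value.
# Indexing with a two-column matrix picks one cell per (row, column) pair.
winners <- data.frame(
  row = seq_len(nrow(m)),
  variable = colnames(m)[idx],
  value = m[cbind(seq_len(nrow(m)), idx)],
  stringsAsFactors = FALSE
)

print(winners)
```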

Answer 2: (score: 0)

This is how I did it; let me know if it works for you.

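df <- data.frame(matrix(rnorm(50, 20), 5))

# Preallocate one slot per row instead of growing an empty object
my_list <- character(nrow(df))

for (i in seq_len(nrow(df))) {
  x <- unlist(df[i, ])             # row i as a named numeric vector
  y <- sort(x, decreasing = TRUE)  # largest value first; names are kept
  # Take the name directly rather than matching the value with grep(),
  # which treats the number as a regex and assumes an "X" prefix
  my_list[i] <- names(y)[1]
}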

> my_list
[1] "X5" "X7" "X7" "X6" "X8"
> df
        X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
1 19.22859 19.78252 20.08969 19.60546 21.09189 18.27778 18.53504 19.38758 18.14770 20.64938
2 20.23044 21.90423 19.91845 21.06613 21.82551 21.08873 22.05754 19.81582 20.74686 19.38851
3 19.83008 19.58174 21.42340 19.66734 20.64790 19.72775 22.35714 19.23881 21.81957 19.44846
4 20.96194 20.17202 20.82502 19.11394 20.18380 21.64440 19.46687 19.73009 18.89267 20.89549
5 19.83232 20.40958 19.94605 19.49419 19.80325 20.39628 19.59710 21.84272 20.02212 21.22459