How to use dplyr

Time: 2019-12-06 23:47:33

Tags: r dplyr tidyr

Given the following data:

df <- data.frame(
  a = c(1,2,3,5),
  b = c(7,9,52,4),
  c = c(53, 11,22,1),
  d = c("something","string","another", "here")
)

which looks like:

  a  b  c         d
1 1  7 53 something
2 2  9 11    string
3 3 52 22   another
4 5  4  1      here

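I want to use dplyr to create a column "max", where max is the name of the column that holds the largest value in each row.

So for the data above, the result would be:

  a  b  c         d  max
1 1  7 53 something   c
2 2  9 11    string   c
3 3 52 22   another   b
4 5  4  1      here   a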

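3 answers:

Answer 0: (score: 1)

We can use max.col to find the column index of the maximum value in each row, use that index to look up the column name, and assign the result to the 'max' column: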

df['max'] <- names(df)[1:3][max.col(df[1:3], "first")]
df
#  a  b  c         d max
#1 1  7 53 something   c
#2 2  9 11    string   c
#3 3 52 22   another   b
#4 5  4  1      here   a
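As a minimal sketch of the same idea, the numeric columns can be picked programmatically instead of by position (1:3). The helper name num_cols is only illustrative. Note that ties.method = "first" makes ties deterministic; the default for max.col is "random".

```r
# Same data as in the question
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

# Names of the numeric columns (num_cols is an illustrative name)
num_cols <- names(df)[sapply(df, is.numeric)]

# Column index of the row-wise maximum, mapped back to a column name
df$max <- num_cols[max.col(df[num_cols], ties.method = "first")]
df
```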

With the tidyverse, another approach is to reshape the data to 'long' format and then find the max within each original row:

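library(dplyr)
library(tidyr)
df %>%
   mutate(ind = row_number()) %>%   # remember the original row
   select(-d) %>%                   # keep only the numeric columns
   pivot_longer(cols = a:c) %>%     # long format: one row per (ind, name, value)
   group_by(ind) %>%
   slice(which.max(value)) %>%      # largest value within each original row
   select(-value) %>%
   pull(name) %>%                   # column names as a character vector
   mutate(df, max = .)              # attach them to the original data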
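The long-format approach can also be sketched with slice_max, assuming dplyr >= 1.0.0 is available (pivot_longer needs tidyr >= 1.0.0). The intermediate name max_names is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

max_names <- df %>%
  mutate(ind = row_number()) %>%                    # tag each original row
  pivot_longer(cols = a:c) %>%                      # columns: d, ind, name, value
  group_by(ind) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%    # one row per ind: its largest value
  ungroup() %>%
  arrange(ind) %>%                                  # restore the original row order
  pull(name)

df$max <- max_names
df
```

Setting with_ties = FALSE keeps exactly one row per group, so the result always has the same length as the original data.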

Or with pmap:

library(purrr)
df %>% 
   mutate(max = pmap_chr(select(., a:c), ~ c(...) %>% 
                                   which.max %>% 
                                   names ))
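To make the pmap version easier to follow: pmap calls the function once per row and passes the columns as named arguments, so c(...) builds a named vector such as c(a = 1, b = 7, c = 53). A minimal sketch with the lambda written out as a named function (row_max_name is a hypothetical helper, not part of purrr):

```r
library(purrr)

# Same data as in the question
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

# Hypothetical helper: name of the largest value among the arguments
row_max_name <- function(...) {
  vals <- c(...)                 # named numeric vector for one row
  names(vals)[which.max(vals)]   # name of the first maximum
}

# Iterate over the rows of the numeric columns only
df$max <- pmap_chr(df[c("a", "b", "c")], row_max_name)
df
```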

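Answer 1: (score: 1)

apply(df[1:3], 1, max) >> assuming your data frame is named df. MARGIN = 1 works row by row (MARGIN = 2 would work column by column), and the character column d has to be left out. Note that this returns the largest value in each row, not the column name.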
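Since the question asks for the name of the column holding each row's maximum, a row-wise apply (MARGIN = 1) over the numeric columns can be sketched as follows. The names num and row_max are only illustrative, and the numeric columns are assumed to be a, b and c.

```r
# Same data as in the question
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

# Restrict to the numeric columns so apply() works on a numeric matrix
num <- df[c("a", "b", "c")]

# Largest value in each row
row_max <- apply(num, 1, max)

# Name of the column that holds it
df$max <- names(num)[apply(num, 1, which.max)]
df
```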

Answer 2: (score: 1)

df %>%
    group_by(ind = row_number()) %>%
    mutate(max = c("a", "b", "c")[which.max(c(a, b, c))]) %>%
    ungroup() %>%
    select(-ind)
# A tibble: 4 x 5
#      a     b     c d         max  
#  <dbl> <dbl> <dbl> <fct>     <chr>
#1     1     7    53 something c    
#2     2     9    11 string    c    
#3     3    52    22 another   b    
#4     5     4     1 here      a
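The same row-by-row logic can be sketched with rowwise(), which avoids creating and dropping the helper column ind. This is only an alternative spelling of the answer above; the name res is illustrative.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

res <- df %>%
  rowwise() %>%                                             # treat every row as its own group
  mutate(max = c("a", "b", "c")[which.max(c(a, b, c))]) %>% # name of the largest of a, b, c
  ungroup()                                                 # drop the row-wise grouping

res
```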