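Using mutate_at and which.max to operate on selected columns of a data frame

Asked: 2017-11-21 21:31:47

Tags: r dplyr tidyverse

I am trying to use a combination of mutate_at and which.max to manipulate a data frame, as shown below.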

library(dplyr)

# This is basically what I want to achieve
df_want <- iris %>% group_by(Species) %>% mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
                                      Sepal.Width = Sepal.Width[which.max(Petal.Width)])

#Here is my attempt at a smarter solution, but it does not work
df_attempt <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x[which.max("Petal.Width")])

#However, this works
df_test <- iris %>% group_by(Species) %>% mutate_at(c("Sepal.Length", "Sepal.Width"), function(x) x + 100)

The code that generates df_attempt does not work. I get the following error message:

Error in mutate_impl(.data, dots) : 
  Column `Sepal.Length` must be length 50 (the group size) or one, not 0

How can I fix this while still using mutate_at?

1 Answer:

Answer 0: (score: 2)

The standard dplyr way to do this is:

df_want <- iris %>% 
  group_by(Species) %>% 
  mutate(Sepal.Length = Sepal.Length[which.max(Petal.Width)],
         Sepal.Width = Sepal.Width[which.max(Petal.Width)])

df_attempt <- iris %>% 
  group_by(Species) %>% 
  mutate_at(vars(Sepal.Length, Sepal.Width), funs(.[which.max(Petal.Width)]))

Result:

# A tibble: 150 x 5
# Groups:   Species [3]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
 1            5         3.5          1.4         0.2  setosa
 2            5         3.5          1.4         0.2  setosa
 3            5         3.5          1.3         0.2  setosa
 4            5         3.5          1.5         0.2  setosa
 5            5         3.5          1.4         0.2  setosa
 6            5         3.5          1.7         0.4  setosa
 7            5         3.5          1.4         0.3  setosa
 8            5         3.5          1.5         0.2  setosa
 9            5         3.5          1.4         0.2  setosa
10            5         3.5          1.5         0.1  setosa
# ... with 140 more rows

> identical(df_want, df_attempt)
[1] TRUE
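
As for why the original attempt failed: it passes the string "Petal.Width" to which.max instead of the column itself. The base-R sketch below (the vector x is made-up illustration data, not taken from the question) suggests what happens. The string cannot be coerced to a number, so which.max returns an empty index, and subsetting with an empty index gives a length-0 vector. That would match the "not 0" in the error message.

```r
# Hypothetical stand-in for one group's Sepal.Length values
x <- c(5.1, 4.9, 5.0)

# The string is coerced to NA (with a warning), so no maximum is found
idx <- suppressWarnings(which.max("Petal.Width"))

length(idx)     # 0: an empty integer index
length(x[idx])  # 0: subsetting by an empty index gives an empty vector
```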

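Notes:

  1. vars() lets you refer to columns by bare name, using non-standard evaluation (NSE).

  2. funs() lets you refer to the current column as ., which plays the role of x in function(x) x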
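
In newer dplyr releases, mutate_at is superseded and funs is deprecated. The following is a sketch of the same operation written with across(), assuming dplyr 1.0.0 or later is installed. The name df_across is chosen here for illustration only.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where across() is available
df_across <- iris %>%
  group_by(Species) %>%
  # .x is the current column; Petal.Width is looked up within each group
  mutate(across(c(Sepal.Length, Sepal.Width), ~ .x[which.max(Petal.Width)]))
```

Within each species, both selected columns are replaced by their value at the row with the largest Petal.Width, as in the mutate_at version above.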