Question

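I want to take two variables from a table, divide each by a third variable, and add the results as two new columns. mutate_at() gets me very close, but inside the custom function f() below I want to access another column of the dataset. Any suggestions, or other tidyverse approaches?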

library(dplyr)
# this works fine but is NOT what I want
f <- function(fld){
  fld/5
}

# This IS what I want where wt is a field in the data
f <- function(fld){
  fld/wt
}

mutate_at(mtcars, .vars = vars(mpg, cyl), .funs = funs(xyz = f))

# This works but is pretty clumsy
f <- function(fld, dat) fld/dat$wt
mutate_at(mtcars, .vars = vars(mpg, cyl), .funs = funs(xyz = f(., mtcars)))

# This is closer but still it would be better if the function allowed the dataset to be submitted to the function without restating the name of the dataset

f <- function(fld, second){
  fld/second
}

mutate_at(mtcars, .vars = vars(mpg, cyl), .funs = funs(xyz = f(., wt)))

Answer 1

Maybe something like this?

f <- function(fld,var){
    fld/var
}

mtcars %>%
    mutate_at(vars(mpg,cyl), .funs = funs(xyz = f(.,wt)))

Answer 2

library(tidyverse)
f <- function(num, denom) num/denom

mtcars %>% 
  mutate_at(vars(mpg, cyl), f, denom = quote(wt))

In this particular example, though, a custom function isn't needed.

mtcars %>% 
  mutate_at(vars(mpg, cyl), `/`, quote(wt))

Answer 3

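There is a cur_data() function that helps make the mutate_at() call more compact, because you don't have to pass a second argument to the function applied to each column: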

f <- function(fld){
  fld / cur_data()$wt
}
mutate_at(mtcars, .vars=vars(mpg, cyl), .funs=funs(xyz = f))

Additional notes:

If you need the function to reference grouping variables, use cur_data_all().
mutate_at() is now superseded by mutate(.data, across()), so it is better to write:

mtcars %>% mutate(across(.cols=c(mpg, cyl), .fns=f, .names='{.col}_xyz'))
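For comparison, here is a minimal sketch of the same across() idea that passes wt explicitly through a lambda instead of looking it up inside the function. It reuses the two-argument f(fld, second) from the question; the name result and the _xyz suffix are only illustrative. It assumes dplyr 1.0.0 or later, where across() is available.

```r
library(dplyr)

# Two-argument helper, same shape as in the question
f <- function(fld, second) {
  fld / second
}

# The lambda is evaluated inside mutate(), so wt refers to the column
# of the data frame being mutated; the data set is not named again.
result <- mtcars %>%
  mutate(across(.cols = c(mpg, cyl),
                .fns = ~ f(.x, wt),
                .names = "{.col}_xyz"))

head(result[, c("mpg", "cyl", "wt", "mpg_xyz", "cyl_xyz")])
```

Because the divisor is an ordinary argument, f stays usable outside of dplyr verbs, which may or may not matter for your use case.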

Answer 4

Why not simply

mutate(mtcars, mpg2 = mpg / wt, cyl2 = cyl / wt)

Using dplyr mutate_at with a custom function
