In dplyr::mutate

Date: 2016-10-24 21:06:02

Tags: r dplyr tidyverse

My data looks like this:

library(tidyverse)

df <- tribble(
    ~y_val, ~z_val,
    2, 4,
    5, 3, 
    8, 2, 
    1, 1, 
    9, 3)

I have a custom function, fun_b(), that I want to apply to the data frame through a dplyr::mutate call. However, fun_b() uses another function, fun_a(), which contains a loop:

fun_a <- function(x, y, z, times = 1) {

    df <- data.frame()
    for (i in 1:times) {
        x <- x * 2 + i * x
        y <- y / 3 + i * y
        z <- z + 1 + z * i
        d <- data.frame(x, y, z)
        df <- rbind(df, d)
    }
    return(df)
}

fun_b <- function(x, y, z, times = 1) {
    df <- fun_a(x, y, z, times)
    x_r <- sum(df$x)
    y_r <- sum(df$y)
    z_r <- sum(df$z)
    val <- x_r / y_r * z_r
    return(val)
}

When I run the custom function:

df %>% 
    mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))

Every value in the new test column is the same (13.95). That makes no sense! For example, the first row of the tibble (y_val = 2, z_val = 4) should give 10.125:

fun_b(x = 1, y = 2, z = 4, times = 1)

What is going on here?

2 answers:

Answer 0: (score: 1)

Try the following:

df %>% 
    group_by(y_val, z_val) %>% 
    mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))

It gives me 10.125 for the first row.
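
A caveat, shown here as a minimal sketch rather than as part of the original answer: grouping by y_val and z_val only yields one group per row while every (y_val, z_val) pair is unique. The data frame dup below is hypothetical (the first row is repeated), and fun_a()/fun_b() are condensed copies of the functions from the question.

```r
library(dplyr)

# fun_a() and fun_b() condensed from the question (same behaviour)
fun_a <- function(x, y, z, times = 1) {
    df <- data.frame()
    for (i in 1:times) {
        x <- x * 2 + i * x
        y <- y / 3 + i * y
        z <- z + 1 + z * i
        df <- rbind(df, data.frame(x, y, z))
    }
    df
}

fun_b <- function(x, y, z, times = 1) {
    df <- fun_a(x, y, z, times)
    sum(df$x) / sum(df$y) * sum(df$z)
}

# Hypothetical data: the pair (y_val = 2, z_val = 4) appears twice
dup <- data.frame(y_val = c(2, 2, 5), z_val = c(4, 4, 3))

res <- dup %>%
    group_by(y_val, z_val) %>%
    mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1)) %>%
    ungroup()

# The duplicated rows share one group, so fun_b() sums over both of them
res$test
## [1] 20.25 20.25  3.15
```

With duplicates present, the two identical rows get 20.25 instead of 10.125, so this approach seems safe only when the grouping columns identify each row uniquely.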

Answer 1: (score: 1)

You can group by row so that the function is evaluated separately for each row:

df %>% 
    rowwise() %>%
    mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))

## Source: local data frame [5 x 3]
## Groups: <by row>
## 
## # A tibble: 5 × 3
##   y_val z_val     test
##   <dbl> <dbl>    <dbl>
## 1     2     4 10.12500
## 2     5     3  3.15000
## 3     8     2  1.40625
## 4     1     1  6.75000
## 5     9     3  1.75000
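
To see why the ungrouped call returned 13.95 everywhere, it may help to reproduce both evaluations in base R. This is a sketch under the assumption that an ungrouped mutate() hands fun_b() the whole columns in a single call, while rowwise() amounts to one call per row; mapply() stands in for the per-row calls here, and fun_a()/fun_b() are condensed copies of the functions from the question.

```r
# fun_a() and fun_b() condensed from the question (same behaviour)
fun_a <- function(x, y, z, times = 1) {
    df <- data.frame()
    for (i in 1:times) {
        x <- x * 2 + i * x
        y <- y / 3 + i * y
        z <- z + 1 + z * i
        df <- rbind(df, data.frame(x, y, z))
    }
    df
}

fun_b <- function(x, y, z, times = 1) {
    df <- fun_a(x, y, z, times)
    sum(df$x) / sum(df$y) * sum(df$z)
}

y_val <- c(2, 5, 8, 1, 9)
z_val <- c(4, 3, 2, 1, 3)

# One call on the whole columns: sum() collapses all five rows into one number
whole <- fun_b(x = 1, y = y_val, z = z_val, times = 1)
whole    # approximately 13.95; mutate() recycles this single value to every row

# One call per row: each (y, z) pair is summed on its own
per_row <- mapply(fun_b, y = y_val, z = z_val,
                  MoreArgs = list(x = 1, times = 1))
per_row
## [1] 10.12500  3.15000  1.40625  6.75000  1.75000
```

The sum() calls inside fun_b() are what make it non-vectorized: they reduce whatever vectors they receive to a single number.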

Or edit fun_b so that it is vectorized, or simply let R vectorize it for you:

df %>% mutate(test = Vectorize(fun_b)(x = 1, y = y_val, z = z_val, times = 1))

## # A tibble: 5 × 3
##   y_val z_val     test
##   <dbl> <dbl>    <dbl>
## 1     2     4 10.12500
## 2     5     3  3.15000
## 3     8     2  1.40625
## 4     1     1  6.75000
## 5     9     3  1.75000
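
For the first option, editing fun_b so that it is vectorized, one possible rewrite is sketched below. The name fun_b_vec is hypothetical, and the sketch assumes times >= 1. It keeps a running total per element instead of building a data frame and calling sum() over all rows, so each input position gets its own result.

```r
# Hypothetical vectorized variant of fun_b(); assumes times >= 1
fun_b_vec <- function(x, y, z, times = 1) {
    # Recycle the inputs to a common length, as data.frame() did in fun_a()
    n <- max(length(x), length(y), length(z))
    x <- rep_len(x, n)
    y <- rep_len(y, n)
    z <- rep_len(z, n)

    # Per-element running totals replace sum(df$x), sum(df$y), sum(df$z)
    x_r <- y_r <- z_r <- numeric(n)
    for (i in seq_len(times)) {
        x <- x * 2 + i * x
        y <- y / 3 + i * y
        z <- z + 1 + z * i
        x_r <- x_r + x
        y_r <- y_r + y
        z_r <- z_r + z
    }
    x_r / y_r * z_r
}

out <- fun_b_vec(x = 1, y = c(2, 5, 8, 1, 9), z = c(4, 3, 2, 1, 3), times = 1)
out
## [1] 10.12500  3.15000  1.40625  6.75000  1.75000
```

A function written this way could be used directly as mutate(test = fun_b_vec(x = 1, y = y_val, z = z_val, times = 1)), with no rowwise() or Vectorize() needed.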