My data looks like this:
library(tidyverse)
df <- tribble(
~y_val, ~z_val,
2, 4,
5, 3,
8, 2,
1, 1,
9, 3)
I have a custom function, fun_b(), that I want to apply to the data frame through a dplyr::mutate call. However, fun_b() uses another function, fun_a(), which contains a loop:
fun_a <- function(x, y, z, times = 1) {
  df <- data.frame()
  for (i in 1:times) {
    x <- x * 2 + i * x
    y <- y / 3 + i * y
    z <- z + 1 + z * i
    d <- data.frame(x, y, z)
    df <- rbind(df, d)
  }
  return(df)
}
fun_b <- function(x, y, z, times = 1) {
  df <- fun_a(x, y, z, times)
  x_r <- sum(df$x)
  y_r <- sum(df$y)
  z_r <- sum(df$z)
  val <- x_r / y_r * z_r
  return(val)
}
When I run my custom function:
df %>%
mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))
every value in the new test column is the same (13.95). That makes no sense! For example, the first row of the tibble (y_val = 2, z_val = 4) should give 10.125:
fun_b(x = 1, y = 2, z = 4, times = 1)
What is going on here?
Answer 0 (score: 1)
Try the following:
df %>%
group_by(y_val, z_val) %>%
mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))
That gives me 10.125 for the first row.
Answer 1 (score: 1)
You can group by row so that the function is evaluated separately for each row:
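To see why grouping changes the result, here is a small base-R sketch. It copies fun_a() and fun_b() from the question (slightly condensed) and uses two plain vectors, y_val and z_val, in place of the tibble columns; the names whole_columns and per_row are only for illustration. An ungrouped mutate() evaluates fun_b() once with the entire columns, so the sum() calls collapse all five rows into a single number, which is then recycled down the column.

```r
# fun_a() and fun_b() copied from the question so this runs on its own
fun_a <- function(x, y, z, times = 1) {
  df <- data.frame()
  for (i in 1:times) {
    x <- x * 2 + i * x
    y <- y / 3 + i * y
    z <- z + 1 + z * i
    df <- rbind(df, data.frame(x, y, z))  # one row per input element
  }
  df
}
fun_b <- function(x, y, z, times = 1) {
  df <- fun_a(x, y, z, times)
  sum(df$x) / sum(df$y) * sum(df$z)  # sums run over ALL rows passed in
}

# Stand-ins for the two tibble columns
y_val <- c(2, 5, 8, 1, 9)
z_val <- c(4, 3, 2, 1, 3)

# Roughly what an ungrouped mutate() evaluates: one call on whole columns
whole_columns <- fun_b(x = 1, y = y_val, z = z_val, times = 1)
whole_columns  # 13.95

# One call per (y, z) pair, which is what per-row grouping achieves
per_row <- mapply(fun_b, y = y_val, z = z_val,
                  MoreArgs = list(x = 1, times = 1))
per_row  # 10.125 3.15 1.40625 6.75 1.75
```

Note that group_by(y_val, z_val) only behaves like a per-row evaluation while every (y_val, z_val) combination is unique, as it is in this data. Rows that share the same pair would land in one group and be summed together again.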
df %>%
rowwise() %>%
mutate(test = fun_b(x = 1, y = y_val, z = z_val, times = 1))
## Source: local data frame [5 x 3]
## Groups: <by row>
##
## # A tibble: 5 × 3
## y_val z_val test
## <dbl> <dbl> <dbl>
## 1 2 4 10.12500
## 2 5 3 3.15000
## 3 8 2 1.40625
## 4 1 1 6.75000
## 5 9 3 1.75000
Or edit fun_b so that it is vectorized, or simply let R do that for you:
df %>% mutate(test = Vectorize(fun_b)(x = 1, y = y_val, z = z_val, times = 1))
## # A tibble: 5 × 3
## y_val z_val test
## <dbl> <dbl> <dbl>
## 1 2 4 10.12500
## 2 5 3 3.15000
## 3 8 2 1.40625
## 4 1 1 6.75000
## 5 9 3 1.75000
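As a rough sketch of the "edit fun_b" option, one possible vectorized rewrite is shown below. The name fun_b_vec is hypothetical and not part of the question. It assumes the intent is a per-element result, so it keeps a running total for each element instead of building a data frame and summing across all rows. It also uses seq_len(times) rather than 1:times, which behaves differently only when times is 0.

```r
# Hypothetical vectorized variant of fun_b(): returns one value per element
fun_b_vec <- function(x, y, z, times = 1) {
  # Recycle the inputs to a common length, as data.frame() would
  n <- max(length(x), length(y), length(z))
  x <- rep_len(x, n)
  y <- rep_len(y, n)
  z <- rep_len(z, n)

  # Running per-element totals over the loop iterations
  x_r <- numeric(n)
  y_r <- numeric(n)
  z_r <- numeric(n)

  for (i in seq_len(times)) {
    # Same update rules as fun_a() in the question
    x <- x * 2 + i * x
    y <- y / 3 + i * y
    z <- z + 1 + z * i
    x_r <- x_r + x
    y_r <- y_r + y
    z_r <- z_r + z
  }
  x_r / y_r * z_r
}

# Stand-ins for the two tibble columns
y_val <- c(2, 5, 8, 1, 9)
z_val <- c(4, 3, 2, 1, 3)

res <- fun_b_vec(x = 1, y = y_val, z = z_val, times = 1)
res  # 10.125 3.15 1.40625 6.75 1.75
```

With a function like this, a plain mutate(test = fun_b_vec(x = 1, y = y_val, z = z_val, times = 1)) should need neither rowwise() nor Vectorize().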