R dplyr: mutate with Reduce using init after group_by

Date: 2017-09-30 20:45:10

Tags: r dplyr

Is it possible to specify an initial value for Reduce without having it added to the data frame?

For example, given the function:

f <- function(x, y) if (y<0) -x * y else x + y

applied to the data frame:

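    library(dplyr)  # provides %>%, tibble(), group_by(), mutate()

    set.seed(0)
    # one column x: a fixed first value followed by 9 sampled values
    df <- tibble(x = c(-0.9, sample(c(-0.9, 1:3), 9, replace = TRUE)))
    df <- df %>% mutate(id = 'a')
    df$id[6:10] <- 'b'
    # running f within each id; without init, each group starts from its first x
    df <- df %>% group_by(id) %>% mutate(sumprod = Reduce(f, x, accumulate = TRUE)) %>% ungroup()
    df$target <- c(0, 3, 4, 5, 7, 3, 2.7, 5.7, 8.7, 10.7)
    df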

# A tibble: 10 x 4
       x    id sumprod target
   <dbl> <chr>   <dbl>  <dbl>
 1  -0.9     a    -0.9    0.0
 2   3.0     a     2.1    3.0
 3   1.0     a     3.1    4.0
 4   1.0     a     4.1    5.0
 5   2.0     a     6.1    7.0
 6   3.0     b     3.0    3.0
 7  -0.9     b     2.7    2.7
 8   3.0     b     5.7    5.7
 9   3.0     b     8.7    8.7
10   2.0     b    10.7   10.7

The goal is the target column. I tried using init with Reduce, but that adds an extra element.

Reduce(f, df$x[1:5], acc=TRUE, init=0)
 [1]  0  0  3  4  5  7

Using it inside mutate produces an error:

> df <- df %>% group_by(id) %>% mutate(sumprod = Reduce(f, x, acc=TRUE, init=0)) %>% ungroup()
Error in mutate_impl(.data, dots) : 
  Column `sumprod` must be length 5 (the group size) or one, not 6
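
The length mismatch in that error can be reproduced outside of mutate. The following is a minimal sketch in base R, assuming the first group's x values are as printed in the table above (hard-coded here so it does not depend on sample()); the names no_init and with_init are only illustrative.

```r
# f as defined in the question
f <- function(x, y) if (y < 0) -x * y else x + y

# First group (id == 'a') copied from the printed table
x <- c(-0.9, 3, 1, 1, 2)

no_init   <- Reduce(f, x, accumulate = TRUE)            # starts from x[1]
with_init <- Reduce(f, x, accumulate = TRUE, init = 0)  # starts from 0

length(no_init)    # same length as x
length(with_init)  # one element longer than x, hence the mutate error
```

With init, the first element of the result is init itself, so the result no longer fits the group size of 5.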

2 answers:

Answer 0: (score: 2)

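If init is given, Reduce logically adds it to the start of x (when proceeding left to right) or to the end of x (when proceeding right to left). If you don't need that element, you can use tail(..., -1) to drop the first element: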

df %>% 
    group_by(id) %>% 
    mutate(sumprod = tail(Reduce(f, x, acc=TRUE, init=0), -1)) %>% 
    ungroup()

# A tibble: 10 x 4
#       x    id sumprod target
#   <dbl> <chr>   <dbl>  <dbl>
# 1  -0.9     a     0.0    0.0
# 2   3.0     a     3.0    3.0
# 3   1.0     a     4.0    4.0
# 4   1.0     a     5.0    5.0
# 5   2.0     a     7.0    7.0
# 6   3.0     b     3.0    3.0
# 7  -0.9     b     2.7    2.7
# 8   3.0     b     5.7    5.7
# 9   3.0     b     8.7    8.7
#10   2.0     b    10.7   10.7
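
The "start or end" distinction in this answer depends on the right argument of Reduce. As a small sketch using `+` on a toy vector (not the question's data), init ends up first when reducing left to right and last when reducing right to left, so the element to drop changes accordingly:

```r
x <- 1:3

# Left to right (default): init is the first element of the result
left <- Reduce(`+`, x, accumulate = TRUE, init = 0)                # 0 1 3 6

# Right to left: init is the last element of the result
right <- Reduce(`+`, x, accumulate = TRUE, init = 0, right = TRUE) # 6 5 3 0

tail(left, -1)   # drop the leading init:  1 3 6
head(right, -1)  # drop the trailing init: 6 5 3
```

For the question as asked, which reduces left to right, tail(..., -1) is the applicable form.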

Answer 1: (score: 2)

With the tidyverse, you can use accumulate from purrr:

library(tidyverse)
df %>%
   group_by(id) %>%
   mutate(sumprod = accumulate(.x = x, .f = f, .init = 0)[-1]) %>%
   ungroup
# A tibble: 10 x 3
#       x    id sumprod
#   <dbl> <chr>   <dbl>
# 1  -0.9     a     0.0
# 2   3.0     a     3.0
# 3   1.0     a     4.0
# 4   1.0     a     5.0
# 5   2.0     a     7.0
# 6   3.0     b     3.0
# 7  -0.9     b     2.7
# 8   3.0     b     5.7
# 9   3.0     b     8.7
#10   2.0     b    10.7
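
To check the result against the target column without installing any packages, the sketch below hard-codes x and id from the printed table, since the values drawn by sample() under set.seed(0) differ between R versions (the default sampling method changed in R 3.6.0). It swaps in base R's ave() for group_by/mutate, which is not what either answer used, only a dependency-free stand-in for the same grouped computation.

```r
# f as defined in the question
f <- function(x, y) if (y < 0) -x * y else x + y

# Data copied from the printed table rather than regenerated with sample()
x      <- c(-0.9, 3, 1, 1, 2, 3, -0.9, 3, 3, 2)
id     <- rep(c("a", "b"), each = 5)
target <- c(0, 3, 4, 5, 7, 3, 2.7, 5.7, 8.7, 10.7)

# Per group: accumulate from init = 0, then drop the leading init
sumprod <- ave(x, id, FUN = function(v) {
  tail(Reduce(f, v, accumulate = TRUE, init = 0), -1)
})

all.equal(sumprod, target)
```

all.equal is used rather than == because values such as 2.7 and 5.7 come out of floating-point arithmetic.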