summarise_each works but summarise_all does not

Time: 2017-12-01 20:52:57

Tags: r dplyr

I am using the summarise_all function in dplyr.

When I use the deprecated version, summarise_each, it works fine, but when I switch to summarise_all I get an error.

Dataset:

Date <- as.Date(c('2017-10-16',
              '2017-10-16',
              '2017-10-17',
              '2017-10-17',
              '2017-10-18',
              '2017-10-18',
              '2017-10-19',
              '2017-10-19',
              '2017-10-20',
              '2017-10-20'))

Source <- as.Date(c('2017-11-29',
                '2017-11-30',
                '2017-11-29',
                '2017-11-30',
                '2017-11-29',
                '2017-11-30',
                '2017-11-29',
                '2017-11-30',
                '2017-11-29',
                '2017-11-30'))

Column1 <- c("A","A","A","A","A","B","B","B","B","B")

Column2 <- c("A","A","A","A","A","B","B","B","B","B")


Revenue <- c(206.88,
         210.88,
         194.13,
         200.13,
         170.00,
         170.00,
         746.65,
         736.65,
         772.00,
         772.00)

Cost <- c(100.88,
      10.88,
      85.13,
      100.13,
      170.00,
      100.00,
      46.65,
      50.65,
      23.00,
      24.00)

df <- data.frame(Date, Source, Column1, Column2, Revenue, Cost)

Data frame:

df

             Date     Source Column1 Column2 Revenue   Cost
    1  2017-10-16 2017-11-29       A       A  206.88 100.88
    2  2017-10-16 2017-11-30       A       A  210.88  10.88
    3  2017-10-17 2017-11-29       A       A  194.13  85.13
    4  2017-10-17 2017-11-30       A       A  200.13 100.13
    5  2017-10-18 2017-11-29       A       A  170.00 170.00
    6  2017-10-18 2017-11-30       B       B  170.00 100.00
    7  2017-10-19 2017-11-29       B       B  746.65  46.65
    8  2017-10-19 2017-11-30       B       B  736.65  50.65
    9  2017-10-20 2017-11-29       B       B  772.00  23.00
    10 2017-10-20 2017-11-30       B       B  772.00  24.00

Here is the code with summarise_each:

by_date_test<-df %>%
group_by(Date) %>%
summarise_each(funs(sum), -c(`Column1`, 
                             `Column2`))

I get a new data frame, but with a warning:

`summarise_each()` is deprecated.
Use `summarise_all()`, `summarise_at()` or `summarise_if()` instead.
To map `funs` over a selection of variables, use `summarise_at()`

When I try the same thing with summarise_all, this is the error I get:

 by_date_test<-df %>%
  group_by(Date) %>%
  summarise_all(funs(sum), -c(`Column1`, 
                              `Column2`))

Error in -c(Column1, Column2) : invalid argument to unary operator

What am I doing wrong with summarise_all? My actual dataset has about 1000 columns, and I want to exclude a few selected columns.

Thanks!

1 answer:

Answer 0: (score: 1)

by_date_test<-df %>%
  group_by(Date) %>%
  summarise_at(vars(-Column1, -Column2), sum)
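As for why summarise_all fails: it has no column-selection argument. Its signature is summarise_all(.tbl, .funs, ...), and anything in ... is forwarded to the functions in .funs. So -c(Column1, Column2) is most likely being evaluated as ordinary R code rather than as a selection, and because Column1 and Column2 also exist as character vectors in the workspace, R tries to negate a character vector. The sketch below reproduces just that step without dplyr; the short vectors are stand-ins for the ones in the question.

```r
# Stand-ins for the character vectors created in the question's setup code.
Column1 <- c("A", "B")
Column2 <- c("A", "B")

# Outside a column-selection context, unary minus is applied to a
# character vector, which base R does not allow.
msg <- tryCatch(
  -c(Column1, Column2),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Wrapping the columns in vars() and passing them to summarise_at, as above, is what makes the minus sign mean "exclude this column".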

Demo:

group_by(mtcars, cyl) %>%
    summarise_at(vars(-mpg, -wt), mean)
# # A tibble: 3 x 9
#     cyl     disp        hp     drat     qsec        vs        am     gear     carb
#   <dbl>    <dbl>     <dbl>    <dbl>    <dbl>     <dbl>     <dbl>    <dbl>    <dbl>
# 1     4 105.1364  82.63636 4.070909 19.13727 0.9090909 0.7272727 4.090909 1.545455
# 2     6 183.3143 122.28571 3.585714 17.97714 0.5714286 0.4285714 3.857143 3.428571
# 3     8 353.1000 209.21429 3.229286 16.77214 0.0000000 0.1428571 3.285714 3.500000
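For readers on a newer dplyr (1.0.0 or later, which is an assumption about your setup), the scoped verbs such as summarise_at are superseded by across(). A minimal sketch of the same exclusion follows. The small data frame is a hypothetical stand-in for the one in the question, and it leaves out the Source column because sum() in current base R is not defined for Date objects.

```r
library(dplyr)

# Hypothetical stand-in data, modelled on the question's columns.
df <- data.frame(
  Date    = as.Date(c("2017-10-16", "2017-10-16", "2017-10-17")),
  Column1 = c("A", "A", "B"),
  Column2 = c("A", "B", "B"),
  Revenue = c(206.88, 210.88, 194.13),
  Cost    = c(100.88, 10.88, 85.13)
)

# across() takes the selection first, so the minus sign excludes columns.
# Grouping columns (Date here) are left out of the selection automatically.
by_date <- df %>%
  group_by(Date) %>%
  summarise(across(-c(Column1, Column2), sum))

print(by_date)
```

With around 1000 columns, a negative selection like this keeps the call short, since only the columns to drop have to be named.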