dplyr error: cannot modify grouping variable, even when applying ungroup first

Asked: 2015-05-14 17:11:28

Tags: r dplyr

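I am getting this error, but the fixes in related posts do not seem to apply. I am using ungroup, although it is no longer needed (see "can I switch the grouping variable in a single dplyr statement?", but also "Format column within dplyr chain"). Also, there are no quotes in my group_by call, and I am not applying any function that acts on the grouping columns (as in "R dplyr summarize_each --> 'Error: cannot modify grouping variable'"), yet I still get this error: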

>  games2 = baseball %>%
+  ungroup %>% 
+  group_by(id, year) %>%
+  summarize(total=g+ab, a = ab+1, id = id)%>%
+  arrange(desc(total)) %>%
+  head(10)
Error: cannot modify grouping variable

This is the baseball dataset that ships with plyr:

           id year stint team lg  g  ab  r  h X2b X3b hr rbi sb cs bb so ibb hbp sh sf gidp
4   ansonca01 1871     1  RC1    25 120 29 39  11   3  0  16  6  2  2  1  NA  NA NA NA   NA
44  forceda01 1871     1  WS3    32 162 45 45   9   4  0  29  8  0  4  0  NA  NA NA NA   NA
68  mathebo01 1871     1  FW1    19  89 15 24   3   1  0  10  2  1  2  0  NA  NA NA NA   NA
99  startjo01 1871     1  NY2    33 161 35 58   5   1  1  34  4  2  3  0  NA  NA NA NA   NA
102 suttoez01 1871     1  CL1    29 128 35 45   3   7  3  23  3  1  1  0  NA  NA NA NA   NA
106 whitede01 1871     1  CL1    29 146 40 47   6   5  1  21  2  2  4  1  NA  NA NA NA   NA

I loaded dplyr before plyr. Are there any other bugs I should check for? Thanks for any corrections or suggestions.

2 answers:

Answer 0 (score: 1)

It is not clear what you are trying to do. I think the following is what you are looking for:

games2 = baseball %>%
  group_by(id, year) %>%
  mutate(total = g + ab, a = ab + 1) %>%
  arrange(desc(total)) %>%
  head(10)
> games2
Source: local data frame [10 x 24]
Groups: id, year

          id year stint team lg   g  ab   r   h X2b X3b hr rbi sb cs bb so ibb hbp sh sf gidp total   a
1  aaronha01 1954     1  ML1 NL 122 468  58 131  27   6 13  69  2  2 28 39  NA   3  6  4   13   590 469
2  aaronha01 1955     1  ML1 NL 153 602 105 189  37   9 27 106  3  1 49 61   5   3  7  4   20   755 603
3  aaronha01 1956     1  ML1 NL 153 609 106 200  34  14 26  92  2  4 37 54   6   2  5  7   21   762 610
4  aaronha01 1957     1  ML1 NL 151 615 118 198  27   6 44 132  1  1 57 58  15   0  0  3   13   766 616
5  aaronha01 1958     1  ML1 NL 153 601 109 196  34   4 30  95  4  1 59 49  16   1  0  3   21   754 602
6  aaronha01 1959     1  ML1 NL 154 629 116 223  46   7 39 123  8  0 51 54  17   4  0  9   19   783 630
7  aaronha01 1960     1  ML1 NL 153 590 102 172  20  11 40 126 16  7 60 63  13   2  0 12    8   743 591
8  aaronha01 1961     1  ML1 NL 155 603 115 197  39  10 34 120 21  9 56 64  20   2  1  9   16   758 604
9  aaronha01 1962     1  ML1 NL 156 592 127 191  28   6 45 128 15  7 66 73  14   3  0  6   14   748 593
10 aaronha01 1963     1  ML1 NL 161 631 121 201  29   4 44 130 31  5 78 94  18   0  0  5   11   792 632

Answer 1 (score: -1)

The problem is that you are trying to modify id in the summarize call, but you have already grouped on id.
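A minimal sketch of that point, using a hypothetical toy data frame (`toy`, not the real baseball data) and assuming dplyr is installed. The first pipeline reassigns the grouping column inside the summarising call and is wrapped in tryCatch(), since the exact condition raised may differ between dplyr versions. The second drops `id = id`; the grouping columns are carried into the result automatically.

```r
library(dplyr)

# Hypothetical toy data standing in for plyr's baseball set
toy <- data.frame(
  id   = c("a", "a", "b"),
  year = c(2000, 2000, 2001),
  g    = c(10, 20, 30),
  ab   = c(1, 2, 4)
)

# Reassigning the grouping column `id` is what the error complains about.
# tryCatch() keeps the script running and stores the error message, if any.
res <- tryCatch(
  toy %>%
    group_by(id, year) %>%
    summarise(total = sum(g + ab), id = id),
  error = function(e) conditionMessage(e)
)

# Without `id = id`: id and year are still present in the output
ok <- toy %>%
  group_by(id, year) %>%
  summarise(total = sum(g + ab))

print(res)
print(ok)
```

Note that sum() is used here so that each group collapses to one value; this is an illustrative choice, not what the question's code computes.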

From your example, it looks like you want mutate anyway. You would use summarize if the function you are applying returns a single value, such as sum or mean.

games2 = baseball %>%
  dplyr::group_by(id, year) %>%
  dplyr::mutate(
  total = g + ab, 
  a = ab + 1
  ) %>%
  dplyr::select(id, year, total, a) %>%
  dplyr::arrange(desc(total)) %>%
  head(10)

Source: local data frame [10 x 4]
Groups: id, year

          id year total   a
1  aaronha01 1954   590 469
2  aaronha01 1955   755 603
3  aaronha01 1956   762 610
4  aaronha01 1957   766 616
5  aaronha01 1958   754 602
6  aaronha01 1959   783 630
7  aaronha01 1960   743 591
8  aaronha01 1961   758 604
9  aaronha01 1962   748 593
10 aaronha01 1963   792 632
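
To make the mutate versus summarize distinction from this answer concrete, here is a small sketch on a hypothetical toy data frame (`toy`, with made-up values), assuming dplyr is installed. mutate() returns one row per input row, while summarise() collapses each group to a single row.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the baseball set
toy <- data.frame(
  id = c("a", "a", "b"),
  ab = c(1, 2, 4)
)

# mutate(): row-wise arithmetic, the output keeps all 3 rows
per_row <- toy %>%
  group_by(id) %>%
  mutate(a = ab + 1)

# summarise(): sum() and mean() return one value per group, so 2 rows
per_group <- toy %>%
  group_by(id) %>%
  summarise(total_ab = sum(ab), mean_ab = mean(ab))

print(per_row)
print(per_group)
```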