Dynamically assign the max value of column(s) grouped by another column in dplyr

Time: 2018-05-16 07:37:48

Tags: r dplyr

I have a data frame in which I'd like to assign the max value of dynamically named column(s), grouped on another static column. I think the best way to express the question is through an example.

Let's say I have the following data frame my.events:

           x        title typethx typesea
1 2016-11-24 Thanksgiving       1       0
2 2016-11-25 Thanksgiving       2       0
3 2016-11-26 Thanksgiving       3       0
4 2016-11-26       Season       0       1
5 2016-11-27       Season       0       2

The two event types overlap in date on 2016-11-26. I'd therefore like to group by the x column and then mutate the type columns to their max value within each group.

In a static implementation, this would be written as:

my.events <- my.events %>%
  group_by(x) %>%
  mutate(typethx = max(typethx),
         typesea = max(typesea)) %>%
  ungroup()

which results in:

           x        title typethx typesea
1 2016-11-24 Thanksgiving       1       0
2 2016-11-25 Thanksgiving       2       0
3 2016-11-26 Thanksgiving       3       1
4 2016-11-26       Season       3       1
5 2016-11-27       Season       0       2

However, I'd like to mutate my type columns dynamically. I first tried to mutate a single type column dynamically. In this case, let's say I'd like to apply my mutation to typethx, so I create a variable name = "typethx". I've tried SE approaches using mutate_ and lazyeval. None of them worked; each either raised an error or produced the wrong output (see the attempts below).

Attempt A:

new.events <- my.events %>%
  group_by(x) %>%
  mutate(!!name := max(!!name)) %>%
  ungroup()

Result A:

           x        title typethx typesea
      <fctr>       <fctr>   <chr>   <dbl>
1 2016-11-24 Thanksgiving typethx       0
2 2016-11-25 Thanksgiving typethx       0
3 2016-11-26 Thanksgiving typethx       1
4 2016-11-26       Season typethx       1
5 2016-11-27       Season typethx       2

Attempt B:

new.events <- my.events %>%
  group_by(x) %>%
  mutate_(lazyeval::interp(~name = max(name), name = as.name(name))) %>%
  ungroup()

Result B:

Error: unexpected '=' in "new.events <- my.events %>% group_by(x) %>% mutate_(lazyeval::interp(~name ="

Attempt C:

new.events <- my.events %>%
  group_by(x) %>%
  mutate_(lazyeval::interp(~name, name = as.name) = lazyeval::interp(~max(name), name = as.name(name))) %>%
  ungroup()

Result C:

Error: unexpected '=' in "new.events <- my.events %>% group_by(x) %>% mutate_(lazyeval::interp(~name, name = as.name) ="

Attempt D:

new.events <- my.events %>%
  group_by(x) %>% mutate_(name = lazyeval::interp(~max(name), name = as.name(name))) %>%
  ungroup()

Result D:

           x        title typethx typesea  name
      <fctr>       <fctr>   <dbl>   <dbl> <dbl>
1 2016-11-24 Thanksgiving       1       0     1
2 2016-11-25 Thanksgiving       2       0     2
3 2016-11-26 Thanksgiving       3       1     3
4 2016-11-26       Season       3       1     3
5 2016-11-27       Season       0       2     0

Bonus:

I was thinking of looping through my type columns and mutating each one, but if there's a way to mutate them all at once, that would be nice. For background, these type columns are dummy variable columns I created in a prior step. To keep the question in scope, you can safely assume there is a variable named type.

1 Answer:

Answer 0: (score: 1)

If we need to apply this to multiple columns, use mutate_at:

my.events %>% 
     group_by(x) %>% 
     mutate_at(vars(starts_with("type")), max)
# A tibble: 5 x 4
# Groups:   x [4]
#   x          title        typethx typesea
#   <date>     <chr>          <dbl>   <dbl>
#1 2016-11-24 Thanksgiving       1       0
#2 2016-11-25 Thanksgiving       2       0
#3 2016-11-26 Thanksgiving       3       1
#4 2016-11-26 Season             3       1
#5 2016-11-27 Season             0       2
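
For the single-column case from the question, here is a minimal sketch of one way that should work. In Attempt A, max(!!name) unquotes to max("typethx"), which is just the string itself, and that would explain the <chr> column in Result A. Looking the column up through the .data pronoun avoids this. The second half shows across(), which as far as I know supersedes mutate_at from dplyr 1.0.0 onward. The data frame below is rebuilt from the question, so the column types are an assumption, and one.col and all.cols are illustrative names only.

```r
library(dplyr)

# Example data rebuilt from the question (column types are an assumption).
my.events <- data.frame(
  x = as.Date(c("2016-11-24", "2016-11-25", "2016-11-26",
                "2016-11-26", "2016-11-27")),
  title = c("Thanksgiving", "Thanksgiving", "Thanksgiving",
            "Season", "Season"),
  typethx = c(1, 2, 3, 0, 0),
  typesea = c(0, 0, 0, 1, 2),
  stringsAsFactors = FALSE
)

name <- "typethx"

# Single dynamic column: !!name on the left of := supplies the column name,
# and .data[[name]] fetches the column whose name is stored in `name`.
one.col <- my.events %>%
  group_by(x) %>%
  mutate(!!name := max(.data[[name]])) %>%
  ungroup()

# All "type" columns at once, assuming dplyr >= 1.0.0 for across().
all.cols <- my.events %>%
  group_by(x) %>%
  mutate(across(starts_with("type"), max)) %>%
  ungroup()
```

Only typethx changes in one.col, while all.cols should match the mutate_at output above. If the column names are already held in a character vector, across(all_of(that_vector), max) is probably the closer fit than starts_with().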