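Grouping in tidyverse: grouping by a consecutive number of levels in a variable with missing levels

Time: 2017-10-14 05:02:05

Tags: r tidyverse

I have a data frame of monthly data (`month`) on the outcome of herbivore parasitism (`result`) for different parasitoid orders (`psitorder`), whose levels are "Hymenoptera" or "Diptera". The result is "p" if the herbivore was parasitized, "a" if the herbivore grew to an adult, or empty (no data) because the herbivore died in captivity.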

 df<-data.frame(month= c(rep(1, each=8), rep(2, each=6), 
                    rep(3, each=6), rep(4, each=8), 
                    rep(5,each=6),rep(6, each=6), 
                    rep(8,each=6),rep(9, each=6)),
                result= c(rep("p",each=3),rep("a",each=3), 
                     rep("",each=2),rep("p",each=3),rep("a",each=2), 
                     rep("",each=1),rep("a",each=3),rep("",each=3),
                     rep("p",each=3),rep("a",each=3),rep("",each=2),
                     rep("p",each=3),rep("a",each=2), 
                     rep("",each=1),rep("a",each=3),rep("",each=3),
                     rep("p",each=3),rep("a",each=3),rep("",each=2),         
                     rep("a",each=4)),
                 psitorder=c(rep("Hymenoptera",each=2),
                     rep("Diptera",each=1),rep("",each=5),
                     rep("Hymenoptera",each=1),rep("Diptera",each=3),
                     rep("",each=2),rep("",each=6),
                     rep("Hymenoptera",each=2),rep("Diptera",each=1),
                     rep("",each=5),rep("Hymenoptera",each=1),
                     rep("Diptera",each=3),rep("",each=2),
                     rep("",each=6),rep("Hymenoptera",each=2), 
                     rep("Diptera",each=1),rep("",each=9)))

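I want to group by the `month` variable, but I need to group the data in blocks of 3 months. In this example, months 1, 2, 3 form one group and months 4, 5, 6 another. The last block only has data for months 8 and 9, so I need to add the missing month 7 in order to keep calculating psit_freq over consecutive 3-month blocks.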

After grouping, I want to calculate psit_freq for each parasitoid order.

This is what I tried:

output %>% 
group_by(month+3) %>% 
mutate(complete(continuous_month= seq(min(continuous_month), 
max(continuous_month), 1L))%>%
summarise(hym_freq = sum(psitorder == 'Hymenoptera')/sum(result %in% c('p', 'a')), 
          dip_freq = sum(psitorder == 'Diptera')/sum(result %in% c('p', 'a')))

The desired output is:

output<- data.frame(group= c("1", "2", "3"), hym_psit= c(3/14, 
         3/14,2/10), dip_psit= c(4/14,4/14,1/10))

1 answer:

Answer 0 (score: 0)

We can create the grouping variable with integer division (`%/%`):

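library(tidyverse)

# Join against the full 1:9 sequence so the missing month 7 appears as an NA row,
# then map months 1-3, 4-6, 7-9 to groups 1, 2, 3 with integer division.
tibble(month = 1:9) %>% 
       full_join(df, by = "month") %>% 
       group_by(group = (month - 1) %/% 3 + 1) %>%
       summarise(hym_freq = sum(psitorder == 'Hymenoptera', na.rm = TRUE)/sum(result %in% c('p', 'a'), na.rm = TRUE), 
                 dip_freq = sum(psitorder == 'Diptera', na.rm = TRUE)/sum(result %in% c('p', 'a'), na.rm = TRUE))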
# A tibble: 3 x 3
#   group  hym_freq  dip_freq    
#    <dbl>     <dbl>     <dbl>
#1     1 0.2142857 0.2857143
#2     2 0.2142857 0.2857143
#3     3 0.2000000 0.1000000
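
To see why the integer-division expression yields the three blocks, here is a minimal base R sketch that applies it to the months present in the example data. The names `months_present` and `block` are made up for illustration and do not appear in the question.

```r
# Months that actually occur in the example data (month 7 is absent)
months_present <- c(1, 2, 3, 4, 5, 6, 8, 9)

# Shift to zero-based, integer-divide by the block width (3), shift back to one-based:
# months 1-3 -> 1, months 4-6 -> 2, months 7-9 -> 3
block <- (months_present - 1) %/% 3 + 1

print(data.frame(month = months_present, group = block))
```

Because the group is computed from the month value itself, months 8 and 9 fall into group 3 whether or not a row for month 7 exists. In this example the join against `1:9` mainly makes the missing month explicit; the all-NA row it adds contributes nothing to the sums.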