Can I use dplyr::mutate to operate across different rows?

Time: 2016-09-01 18:48:19

Tags: r dplyr

I have a dataset; a small sample of the relevant columns is shown below:

year ID type result  
2003 1   new        closed  
2003 2   new        transferred  
2003 3   subsequent closed  
2003 4   subsequent diverted  
....  
2015 1000 new       closed

What I want to calculate is the fraction of subsequents, i.e. no. of subsequents / (no. of subsequents + no. of news), grouped by year and result, as shown below:

year result subsequent_frac  
2003 closed 0.10  
2003 transferred 0.05  
2003 ....  
....  
2015 closed 0.05  
2015 transferred 0.1  

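I know I can do this in steps, using group_by and summarise to get the counts and handling each result separately. I am wondering whether there is a more concise/faster way to do this.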

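1 Answer:

Answer 0: (score: 1)

Is this what you are looking for? Applying summarise drops one level of grouping (here type, so the result is still grouped by year), which is why the second group_by is commented out.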

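library(dplyr)

# Count rows per year/type. Note that subsequent_frac holds a count at this point.
dfSummarized <- group_by(df, year, type) %>% 
            summarise(subsequent_frac = n()) %>% 
            #group_by(type) %>% # maybe you don't need this?
            # summarise() dropped the type level, so sum() runs within each year
            mutate(freq = subsequent_frac / sum(subsequent_frac))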
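
The snippet above gives the share of each type within a year. To get the table sketched in the question (one row per year and result), one possible variation is to group by year and result and take the mean of a logical test on type, since the mean of TRUE/FALSE values is the proportion of TRUE. This is only a sketch: the toy data frame and the name subsequent_by_result are made up for illustration, and it assumes type holds exactly the strings "new" and "subsequent".

```r
library(dplyr)

# Hypothetical toy data mirroring the question's columns (values invented)
df <- data.frame(
  year   = c(2003, 2003, 2003, 2003, 2003, 2015, 2015),
  ID     = 1:7,
  type   = c("new", "subsequent", "new", "subsequent", "subsequent", "new", "new"),
  result = c("closed", "closed", "transferred", "closed", "transferred", "closed", "closed"),
  stringsAsFactors = FALSE
)

# For each year/result pair: subsequents / (subsequents + news).
# mean() of a logical vector is the fraction of TRUE values.
subsequent_by_result <- df %>%
  group_by(year, result) %>%
  summarise(subsequent_frac = mean(type == "subsequent")) %>%
  ungroup()

print(subsequent_by_result)
```

Because the fraction is computed inside a single summarise call, no second group_by or mutate step is needed. Recent dplyr versions may print a message about the remaining grouping after summarise; the trailing ungroup() removes it either way.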