A smarter way to get relative frequency percentages with dplyr?

Asked: 2014-07-16 07:29:39

Tags: r dplyr frequency

I'm learning R as a hobby, using data from some of my Hearthstone matches.

Background: using dplyr:

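library(dplyr)

# %.% was dplyr's early chaining operator; it has since been removed in favour of %>%
todayhs %>%
   group_by(hero, result) %>%
   select(hero, opponent, result) %>%
   summarise(
     count = n())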

Data

hero      result    count   
Mage      loss      12 
Mage      win       9 
Rogue     loss      3                  
Rogue     win       1                  
Warrior   loss      6                  
Warrior   win       5                  

Expected result: a percent column computed within each hero

hero      result    count   percent
Mage      loss      12      57%
Mage      win       9       43% 
Rogue     loss      3       75%           
Rogue     win       1       25%           
Warrior   loss      6       55%           
Warrior   win       5       45% 

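My obstacle
I understand that filter(hero == "Mage") followed by prop.table would give me the percentages for that one class, but is there a way to get all of the above at once?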
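For reference, the single-hero approach described above might look like the following minimal sketch. It assumes the summary table from the question is stored in a data frame named todayhs; the names mage and percent are only illustrative.

```r
library(dplyr)

# Summary table copied from the question
todayhs <- data.frame(
  hero   = c("Mage", "Mage", "Rogue", "Rogue", "Warrior", "Warrior"),
  result = c("loss", "win", "loss", "win", "loss", "win"),
  count  = c(12, 9, 3, 1, 6, 5),
  stringsAsFactors = FALSE
)

# Keep only the Mage rows, then convert their counts to proportions.
# prop.table() on a numeric vector divides each value by the vector's sum.
mage <- filter(todayhs, hero == "Mage")
mage$percent <- prop.table(mage$count)
mage
```

This works for one hero at a time, which is exactly the limitation the question is about: it would have to be repeated for every hero.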

My attempt

 transform(todayhs.mage, percents = ifelse(hero == "Mage",    
 prop.table(todayhs.mage$count[1:2]),""))          

This gives me

 hero      result count      percents
 Mage      loss    12        0.571428571428571
 Mage      win     9         0.428571428571429
 Rogue     loss    3                  
 Rogue     win     1                  
 Warrior   loss    6                  
 Warrior    win    5      

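I figure I could write a function and handle each hero separately... but that doesn't feel right. Maybe there is a better way with dplyr, such as adding group_by(hero, count)? I'm scratching my head here.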

2 answers:

Answer 0: (score: 3)

You could try:

todayhs <- read.table(text="hero      result    count   
Mage      loss      12 
Mage      win       9 
Rogue     loss      3                  
Rogue     win       1                  
Warrior   loss      6                  
Warrior   win       5",sep="",header=T,stringsAsFactors=F)    

 library(dplyr)
 todayhs%>%
 group_by(hero)%>%
 mutate(percent=paste0(round(100*count/sum(count)),"%"))
# Source: local data frame [6 x 4]
 #Groups: hero

 #     hero result count percent
 # 1    Mage   loss    12     57%
 # 2    Mage    win     9     43%
 # 3   Rogue   loss     3     75%
 # 4   Rogue    win     1     25%
 # 5 Warrior   loss     6     55%
 # 6 Warrior    win     5     45%
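Since the question mentions prop.table, here is a variant of the same grouped mutate that might be useful when the percentage also needs to stay numeric (for sorting or plotting, say). It is only a sketch: the column name share and the object name with_share are made up for illustration, and the data frame is rebuilt from the question's table.

```r
library(dplyr)

# Summary table copied from the question
todayhs <- data.frame(
  hero   = c("Mage", "Mage", "Rogue", "Rogue", "Warrior", "Warrior"),
  result = c("loss", "win", "loss", "win", "loss", "win"),
  count  = c(12, 9, 3, 1, 6, 5),
  stringsAsFactors = FALSE
)

with_share <- todayhs %>%
  group_by(hero) %>%
  mutate(
    share   = prop.table(count),               # numeric proportion within each hero
    percent = paste0(round(100 * share), "%")  # formatted label, as in the answer above
  ) %>%
  ungroup()                                    # drop the grouping once the columns exist

with_share
```

Because mutate runs per group after group_by(hero), prop.table(count) only ever sees one hero's counts, so no per-hero filtering is needed.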

Answer 1: (score: 3)

Or use data.table (since you didn't say it has to be a dplyr solution):

library(data.table)
setDT(todayhs)[, Percent := paste0(round(count/sum(count)*100), "%"), by = hero]

#       hero result count Percent
# 1:    Mage   loss    12     57%
# 2:    Mage    win     9     43%
# 3:   Rogue   loss     3     75%
# 4:   Rogue    win     1     25%
# 5: Warrior   loss     6     55%
# 6: Warrior    win     5     45%