Efficiently creating a data frame with mutate()

Date: 2014-12-16 04:20:50

Tags: r dplyr

I have this local data frame:

Source: local data frame [792 x 3]

         team       player_name  g
1     Anaheim       PERRY_COREY 31
2     Anaheim      GETZLAF_RYAN 22
3      Dallas        BENN_JAMIE 25
4  Pittsburgh     CROSBY_SIDNEY 20
5     Toronto       KESSEL_PHIL 27
6    Edmonton       HALL_TAYLOR 16
7      Dallas      SEGUIN_TYLER 24
8    Montreal      VANEK_THOMAS 19
9    Colorado LANDESKOG_GABRIEL 18
10    Chicago     SHARP_PATRICK 22
..        ...               ... ..

I would like to rank the teams by their average number of goals (g) per player. This is what I did (it really does not feel optimal):

    library(dplyr)
    d1 <- select(df, team, g, player_name)
    c1 <- count(d1, team, wt = g)
    c2 <- count(d1, team, wt = n_distinct(player_name))
    c3 <- cbind(c1, c2[,2])
    c4 <- c3[,2] / c3[,3]
    c5 <- cbind(c3, c4)
    colnames(c5) <- c("team", "ttgpt", "ttnp", "agpp")
    c6 <- mutate(c5, rank = row_number(desc(c4)))
    c7 <- filter(c6, rank <=10)
    c8 <- arrange(c7, rank)

Here is the resulting c8:

           team ttgpt ttnp     agpp rank
1       Chicago   177   23 7.695652    1
2      Colorado   164   23 7.130435    2
3       Anaheim   180   26 6.923077    3
4    NY_Rangers   153   23 6.652174    4
5        Boston   179   27 6.629630    5
6      San_Jose   157   25 6.280000    6
7        Dallas   155   25 6.200000    7
8     St._Louis   148   24 6.166667    8
9        Ottawa   160   26 6.153846    9
10 Philadelphia   140   23 6.086957   10

I would like to recreate this table using %>%.

For a reproducible example, see the CSV: playerstats.csv

1 answer:

Answer 0 (score: 3)

OK, based on what you described:

library(dplyr)

df <- read.csv("../Downloads/playerstats.csv", header = TRUE, sep = ",")

# Each %>% must end a line; a line that starts with %>% is a syntax error in R
df %>%
  group_by(Team) %>%
  summarise(ttgp = sum(G),
            ttnp = n_distinct(Player.Name),
            agp = sum(G) / n_distinct(Player.Name)) %>%
  mutate(rank = rank(desc(agp))) %>%
  filter(rank <= 10) %>%
  arrange(rank)

        Source: local data frame [10 x 5]

           Team ttgp ttnp      agp rank
1       Chicago  177   23 7.695652    1
2      Colorado  164   23 7.130435    2
3       Anaheim  180   26 6.923077    3
4    NY Rangers  153   23 6.652174    4
5        Boston  179   27 6.629630    5
6      San Jose  157   25 6.280000    6
7        Dallas  155   25 6.200000    7
8     St. Louis  148   24 6.166667    8
9        Ottawa  160   26 6.153846    9
10 Philadelphia  140   23 6.086957   10

Note that I am not sure what you mean by ttgpt and ttnp, so I took a guess.
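
For anyone who wants to try the same group/summarise/rank chain without downloading the CSV, here is a minimal sketch on a tiny made-up data frame. The data frame `toy`, its values, and the result name `team_rank` are assumptions for illustration only; the column names Team, Player.Name and G simply mirror the ones used in the answer above. I am reading ttgp as "total goals per team" and ttnp as "number of distinct players per team", which is a guess.

```r
library(dplyr)

# Hypothetical toy data standing in for playerstats.csv (values are made up)
toy <- data.frame(
  Team        = c("A", "A", "B", "B", "B", "C"),
  Player.Name = c("p1", "p2", "p3", "p4", "p5", "p6"),
  G           = c(10, 20, 6, 9, 3, 12),
  stringsAsFactors = FALSE
)

team_rank <- toy %>%
  group_by(Team) %>%
  summarise(
    ttgp = sum(G),                           # total goals for the team (assumed meaning)
    ttnp = n_distinct(Player.Name),          # distinct players on the team (assumed meaning)
    agp  = sum(G) / n_distinct(Player.Name)  # average goals per player
  ) %>%
  mutate(rank = rank(desc(agp))) %>%         # rank 1 = highest average
  filter(rank <= 2) %>%                      # keep the top 2 teams of this toy set
  arrange(rank)

print(team_rank)
```

With this toy data, team A averages 15 goals per player, C averages 12 and B averages 6, so only A and C survive the filter. One thing to keep in mind: base `rank()` gives tied teams an averaged rank (for example 1.5), whereas dplyr's `row_number()`, used in the question, breaks ties by position, and `min_rank()` gives tied teams the same lowest rank. Which one you want depends on how ties should be treated in the top 10.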