Convert column counts to a data frame with new columns

Time: 2017-04-21 06:06:19

Tags: r

I have a dataset of baseball play-by-play data. Here is a simplified example:

team <- c('A','A','A','A','A','A','A',
      'B','B','B','B','B','B','B',
      'C','C','C','C','C','C','C')
event <- c("OUT","WALK","OUT","OUT","HR","WALK","OUT",
       "WALK","OUT","HR","WALK","OUT","OUT","WALK",
        "HR","HR","WALK","WALK","HR","OUT","WALK")
df <- data.frame(team, event)
df

   team event
1     A   OUT
2     A  WALK
3     A   OUT
4     A   OUT
5     A    HR
6     A  WALK
7     A   OUT
8     B  WALK
9     B   OUT
10    B    HR
11    B  WALK
12    B   OUT
13    B   OUT
14    B  WALK
15    C    HR
16    C    HR
17    C  WALK
18    C  WALK
19    C    HR
20    C   OUT
21    C  WALK

I want to create a new data frame that shows how many times each event occurred for each team, with each event in its own column, like this:

  team OUT WALK HR
1    A   4    2  1
2    B   3    3  1
3    C   1    3  3

I assume there must be a way to do this with dplyr, but I can't figure it out.

1 answer:

Answer 0: (score: 1)

We can try dplyr/tidyr. Get the count by 'team' and 'event', then spread from 'long' to 'wide' format.

library(tidyverse)
df %>%
   count(team, event) %>% 
   spread(event, n)
# A tibble: 3 × 4
#    team    HR   OUT  WALK
#* <fctr> <int> <int> <int>
#1      A     1     4     2
#2      B     1     3     3
#3      C     3     1     3

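If we need the columns in a specific order, first convert 'event' to a factor with levels specified as the unique elements of 'event', so the columns follow the order of first appearance.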

df %>% 
  mutate(event = factor(event, levels = unique(event))) %>% 
  count(team, event) %>% 
  spread(event, n)
# A tibble: 3 × 4
#    team   OUT  WALK    HR
#*  <fctr> <int> <int> <int>
#1      A     4     2     1
#2      B     3     3     1
#3      C     1     3     3
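
In recent versions of tidyr, spread is superseded by pivot_wider. The same reshaping can be sketched as follows, assuming tidyr 1.0.0 or later is installed. The name wide is only illustrative, and values_fill = 0L is an optional addition that fills team/event combinations that never occur with 0 instead of NA (not needed for this sample data).

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
team <- rep(c("A", "B", "C"), each = 7)
event <- c("OUT", "WALK", "OUT", "OUT", "HR", "WALK", "OUT",
           "WALK", "OUT", "HR", "WALK", "OUT", "OUT", "WALK",
           "HR", "HR", "WALK", "WALK", "HR", "OUT", "WALK")
df <- data.frame(team, event)

wide <- df %>%
  # Fix the column order to the order of first appearance
  mutate(event = factor(event, levels = unique(event))) %>%
  # One row per team/event pair, with the count in column n
  count(team, event) %>%
  # One column per event; missing combinations become 0
  pivot_wider(names_from = event, values_from = n, values_fill = 0L)

wide
```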

Or use dcast from data.table:
library(data.table)
dcast(setDT(df), team~event, length)

Or use table from base R:
table(df)
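
Note that table returns a contingency table rather than a data frame. If a data frame with a team column is needed, as in the question, one possible sketch in base R is shown below. The names tab and counts are illustrative only.

```r
# Rebuild the example data from the question
team <- rep(c("A", "B", "C"), each = 7)
event <- c("OUT", "WALK", "OUT", "OUT", "HR", "WALK", "OUT",
           "WALK", "OUT", "HR", "WALK", "OUT", "OUT", "WALK",
           "HR", "HR", "WALK", "WALK", "HR", "OUT", "WALK")
df <- data.frame(team, event)

# Two-way contingency table: teams in rows, events in columns
tab <- table(df$team, df$event)

# Convert to a data frame, keeping the wide layout
counts <- as.data.frame.matrix(tab)

# Move the row names into a regular 'team' column
counts <- data.frame(team = rownames(counts), counts, row.names = NULL)

# Reorder the event columns to match the desired output
counts <- counts[, c("team", "OUT", "WALK", "HR")]
counts
```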