Data frame of per-year counts for each label

Asked: 2019-03-18 14:55:28

Tags: r

Given a data frame like this:

dtest <- data.frame(label=c("yahoo","google","yahoo","yahoo","google","google","yahoo","yahoo"), year=c(2000,2001,2000,2001,2003,2003,2003,2003))

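How can I build a new data frame like the one below, with one row per label and year (including years with no rows) and the row count in volume?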

doutput <- data.frame(label=c("yahoo","yahoo","yahoo","yahoo","google","google","google","google"), year=c(2000,2001,2002,2003,2000,2001,2002,2003), volume=c(2,1,0,2,0,1,0,2))

> doutput
   label year volume
1  yahoo 2000      2
2  yahoo 2001      1
3  yahoo 2002      0
4  yahoo 2003      2
5 google 2000      0
6 google 2001      1
7 google 2002      0
8 google 2003      2

2 answers:

Answer 0 (score: 1)

One way is to use dplyr:

library(dplyr)

dtest %>%
  group_by(label, year) %>%
  tally(name = "volume")

# A tibble: 5 x 3
# Groups:   label [2]
  label   year volume
  <fct>  <dbl>  <int>
1 google  2001      1
2 google  2003      2
3 yahoo   2000      2
4 yahoo   2001      1
5 yahoo   2003      2
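The dplyr result above lists only the label/year pairs that actually occur in dtest, so the zero-count rows asked for in the question are missing. A minimal sketch of one way to add them, assuming the tidyr package is also installed (it is not used in the original answer, and the name dfilled is only illustrative):

```r
library(dplyr)
library(tidyr)

dtest <- data.frame(
  label = c("yahoo", "google", "yahoo", "yahoo", "google", "google", "yahoo", "yahoo"),
  year  = c(2000, 2001, 2000, 2001, 2003, 2003, 2003, 2003)
)

dfilled <- dtest %>%
  # Count rows per label/year pair; the result is an ungrouped data frame.
  count(label, year, name = "volume") %>%
  # Add every missing label/year pair over the full range of years,
  # setting volume to 0 for the pairs that did not occur.
  complete(label, year = full_seq(year, 1), fill = list(volume = 0L))

dfilled
```

Here full_seq(year, 1) expands year to every value between its minimum and maximum in steps of 1, which is what brings in 2002 even though no row has that year.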

Answer 1 (score: 1)

Here is a base R solution:

as.data.frame(table(transform(dtest,
                              year = factor(year, levels = seq(min(year), max(year))))))

Result:

   label year Freq
1 google 2000    0
2  yahoo 2000    2
3 google 2001    1
4  yahoo 2001    1
5 google 2002    0
6  yahoo 2002    0
7 google 2003    2
8  yahoo 2003    2
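The base R result has the right counts, but the count column is called Freq, year comes back as a factor, and the rows are ordered by year rather than by label. A possible follow-up sketch, still base R only, that reshapes it toward the layout requested in the question (the names years and counts are illustrative, not from the original answer):

```r
dtest <- data.frame(
  label = c("yahoo", "google", "yahoo", "yahoo", "google", "google", "yahoo", "yahoo"),
  year  = c(2000, 2001, 2000, 2001, 2003, 2003, 2003, 2003)
)

# Every year in the observed range, so that empty years still get a row.
years <- seq(min(dtest$year), max(dtest$year))

# Cross-tabulate label by year and name the count column "volume".
counts <- as.data.frame(
  table(label = dtest$label, year = factor(dtest$year, levels = years)),
  responseName = "volume",
  stringsAsFactors = FALSE
)

# The table stores year as text; convert it back to a number.
counts$year <- as.numeric(counts$year)

# Order labels by first appearance in dtest, then by year.
counts <- counts[order(match(counts$label, unique(dtest$label)), counts$year), ]
rownames(counts) <- NULL

counts
```

Ordering by first appearance puts yahoo before google, as in the question; dropping the match() call would sort the labels alphabetically instead.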