How to group data while keeping the identifier column unchanged

Time: 2017-08-07 09:57:36

Tags: r grouping

I have a dataset in R that looks like this:

date          jobcategory
2016-01-01     SP    
2016-01-01     DP   
2016-01-01     SP   
2016-01-01     CP   
2016-01-01     DP   
2016-01-01     DP   
2016-01-01     DP   
2016-01-02     SP   
2016-01-02     CP   
2016-01-02     SP   
2016-01-02     CP   
2016-01-02     DP   
2016-01-02     TP   
2016-01-02     DP   
2016-01-02     DP   
2016-01-02     DP   
2016-01-03     SP   
2016-01-03     SP   
2016-01-03     DP   
2016-01-03     DP   
2016-01-03     SP   
2016-01-03     DP   
2016-01-04     CP   
2016-01-04     MP       

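I am trying to group this data so that each date appears exactly once, together with the count of one particular job category from the second column (here SP), including dates where that category does not occur at all: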

date      jobcategory   Count
2016-01-01     SP       2
2016-01-02     SP       2
2016-01-03     SP       3
2016-01-04     SP       0

Any help is much appreciated.

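3 answers:

Answer 0 (score: 1)

A base R solution with table. It cross-tabulates every date against every job category, so combinations that never occur (such as SP on 2016-01-04) still get a row with a frequency of 0.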

> dat <- as.data.frame(table(dat))
> dat <- dat[dat$jobcategory=='SP', ]
> dat


         date jobcategory Freq
13 2016-01-01          SP    2
14 2016-01-02          SP    2
15 2016-01-03          SP    3
16 2016-01-04          SP    0

Data

dat <- 
structure(list(date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
4L), .Label = c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04"
), class = "factor"), jobcategory = structure(c(4L, 2L, 4L, 1L, 
2L, 2L, 2L, 4L, 1L, 4L, 1L, 2L, 5L, 2L, 2L, 2L, 4L, 4L, 2L, 2L, 
4L, 2L, 1L, 3L), .Label = c("CP", "DP", "MP", "SP", "TP"), class = "factor")),
.Names = c("date", "jobcategory"), class = "data.frame", row.names = c(NA, -24L))

Answer 1 (score: 0)

We need to fill in the missing combinations with complete, and then get the 'Count'.
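The code for this answer did not survive in the text above, so what follows is only a sketch of the approach it describes, not the answerer's original code. It assumes that complete refers to tidyr::complete, that dplyr is available for count and filter, and it uses a hypothetical data frame name df1 rebuilt from the rows in the question.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question (df1 is a hypothetical name)
df1 <- data.frame(
  date = rep(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04"),
             times = c(7, 9, 6, 2)),
  jobcategory = c("SP", "DP", "SP", "CP", "DP", "DP", "DP",
                  "SP", "CP", "SP", "CP", "DP", "TP", "DP", "DP", "DP",
                  "SP", "SP", "DP", "DP", "SP", "DP",
                  "CP", "MP"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # count only the date/category pairs that actually occur
  count(date, jobcategory, name = "Count") %>%
  # add the pairs that never occur, with a count of 0
  complete(date, jobcategory, fill = list(Count = 0L)) %>%
  # keep the category of interest
  filter(jobcategory == "SP")

print(res)
```

Counting first and completing afterwards is what lets 2016-01-04 appear with a Count of 0; filtering for SP before completing would drop that date entirely.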

Answer 2 (score: 0)

A one-liner from base R:

sapply(unique(df$date), function(i) 
                        length(df$jobcategory[df$jobcategory == 'SP' & i == df$date]))

#[1] 2 2 3 0
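The one-liner above returns only the counts, without the dates the question wants to keep. One possible variation, still in base R, is sketched below. It is not part of the original answer: it swaps sapply for tapply, and the data frame df is rebuilt here from the rows in the question so that the block runs on its own.

```r
# Sample data rebuilt from the question, named df as in the answer above
df <- data.frame(
  date = rep(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04"),
             times = c(7, 9, 6, 2)),
  jobcategory = c("SP", "DP", "SP", "CP", "DP", "DP", "DP",
                  "SP", "CP", "SP", "CP", "DP", "TP", "DP", "DP", "DP",
                  "SP", "SP", "DP", "DP", "SP", "DP",
                  "CP", "MP"),
  stringsAsFactors = FALSE
)

# Sum the logical "is SP" flag within each date; dates with no SP sum to 0
counts <- tapply(df$jobcategory == "SP", df$date, sum)

# Put the dates (the names of the result) back next to the counts
out <- data.frame(date = names(counts),
                  jobcategory = "SP",
                  Count = as.vector(counts),
                  stringsAsFactors = FALSE)
print(out)
```

Because the comparison is evaluated for every row before grouping, every date is retained even when none of its rows is SP.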