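Summarizing with ddply when not all combinations exist

Asked: 2014-02-13 23:03:55

Tags: r plyr

I am trying to count the number of observations that occur in each month for each combination of station and variable. For example, my data look like this: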

station  variable  date      month
A        v1        1/1/2011  1
A        v1        1/1/2012  1

So I would like the ddply output to be:

station  variable   Jan
A        v1          2

Here is my first attempt at the ddply code:

months <- ddply(seasons, c("station", "variable"), summarize,
              Jan = length(month=1),
              Feb = length(month=2),
              Mar = length(month=3),
              Apr = length(month=4),
              May = length(month=5),
              Jun = length(month=6),
              Jul = length(month=7),
              Aug = length(month=8),
              Sep = length(month=9),
              Oct = length(month=10),
              Nov = length(month=11),
              Dec = length(month=12))

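However, for some station/variable combinations, a given month was never sampled. So, for example, if station B was never sampled in January, I get this error: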

Error in length(month = 1) : 
  supplied argument name 'month' does not match 'x'

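I tried using an ifelse() statement to set the value to zero when month = x does not exist, but had no luck. I also tried pre-filling the data frame "months" with zeros, but that did not work either.

Any ideas? Thanks.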

1 Answer:

Answer 0: (score: 3)

## make some data like yours
set.seed(1)
dat <- seq(as.POSIXct(42, origin = "1990-01-01"), Sys.time(), length.out = 100)
seasons <- data.frame(
  station = sample(LETTERS[1:10], length(dat), TRUE),
  variable = paste0("v", sample(1:5, length(dat), TRUE)),
  date = dat,
  month = as.integer(format(dat, "%m"))
  )

head(seasons)
##   station variable                date month
## 1       C       v4 1989-12-31 19:00:42    12
## 2       D       v2 1990-03-30 18:45:47     3
## 3       F       v2 1990-06-27 19:30:52     6
## 4       J       v5 1990-09-24 19:15:57     9
## 5       C       v4 1990-12-22 18:01:02    12
## 6       I       v2 1991-03-21 17:46:07     3

library(plyr)

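## for each station/variable group, tabulate month as a factor with all
## 12 levels, so months with no observations are counted as 0
out <- ddply(seasons, .(station, variable), function(x)
             table(factor(x$month, levels = 1:12, labels = month.abb)))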

head(out)
##   station variable Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1       A       v1   0   0   1   0   1   0   0   0   0   0   0   0
## 2       A       v2   0   0   1   0   0   0   0   0   0   0   0   0
## 3       A       v3   0   1   1   0   1   0   0   0   0   0   0   0
## 4       A       v4   0   0   0   0   0   0   1   0   0   0   0   0
## 5       B       v1   0   0   0   0   0   0   0   1   0   0   0   0
## 6       B       v3   1   0   0   0   0   0   0   0   1   0   0   0
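
The zeros come from fixing the factor levels before calling table(): table() reports one count per level, including levels that never occur. A minimal base-R sketch of that behavior, where the month vector is a made-up example standing in for one station/variable group:

```r
# Hypothetical months observed for a single station/variable group
month <- c(3, 5, 3)

# Without fixed levels, table() only reports the months that occur
table(month)

# With levels 1:12, every month gets an entry; absent months count as 0
counts <- table(factor(month, levels = 1:12, labels = month.abb))
counts
```

Because every group returns the same 12 named counts, ddply can bind the per-group results into one data frame with a column per month.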

Thanks to @Henrik for the month.abb trick.
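
As a side note on the error in the question: length(month = 1) passes 1 to length() under the argument name month, and length() only accepts an argument named x, which is what the message says. Counting the rows where month equals 1 needs a comparison instead, for example sum(month == 1), which also returns 0 when a group has no matching rows. A minimal sketch, assuming a small made-up data frame (toy is hypothetical, not the asker's data):

```r
library(plyr)

# Hypothetical toy data: station B has no January observations
toy <- data.frame(
  station  = c("A", "A", "B"),
  variable = c("v1", "v1", "v1"),
  month    = c(1, 1, 2)
)

# Summing a logical comparison counts the matching rows in each group
# and yields 0 when none match
res <- ddply(toy, c("station", "variable"), summarize,
             Jan = sum(month == 1),
             Feb = sum(month == 2))
res
```

This keeps the shape of the original attempt, but it still needs one line per month, so the table(factor(...)) approach above is the more compact of the two.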