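I am trying to count the number of observations that occur in each month for every station and variable combination. For example, my data has the format: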
station variable date month
A v1 1/1/2011 1
A v1 1/1/2012 1
etc.
So I would like the ddply output to be:
station variable Jan
A v1 2
This is my first attempt at the ddply code:
months <- ddply(seasons, c("station", "variable"), summarize,
Jan = length(month=1),
Feb = length(month=2),
Mar = length(month=3),
Apr = length(month=4),
May = length(month=5),
Jun = length(month=6),
Jul = length(month=7),
Aug = length(month=8),
Sep = length(month=9),
Oct = length(month=10),
Nov = length(month=11),
Dec = length(month=12))
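However, for some station/variable combinations, a given month was never sampled. So, for example, if station B was never sampled in January, I get the error: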
Error in length(month = 1) :
supplied argument name 'month' does not match 'x'
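I tried using an ifelse() statement to set the value to zero when month = x does not exist, but I could not get it to work. I also tried pre-filling the data frame "months" with zeros, but that did not work either.
Any ideas? Thanks.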
Answer 0: (score: 3)
## make some data like yours
set.seed(1)
dat <- seq(as.POSIXct(42, origin = "1990-01-01"), Sys.time(), length.out = 100)
seasons <- data.frame(
station = sample(LETTERS[1:10], length(dat), TRUE),
variable = paste0("v", sample(1:5, length(dat), TRUE)),
date = dat,
month = as.integer(format(dat, "%m"))
)
head(seasons)
## station variable date month
## 1 C v4 1989-12-31 19:00:42 12
## 2 D v2 1990-03-30 18:45:47 3
## 3 F v2 1990-06-27 19:30:52 6
## 4 J v5 1990-09-24 19:15:57 9
## 5 C v4 1990-12-22 18:01:02 12
## 6 I v2 1991-03-21 17:46:07 3
library(plyr)
out <- ddply(seasons, .(station, variable), function(x)
table(factor(x$month, levels = 1:12, labels = month.abb)))
head(out)
## station variable Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1 A v1 0 0 1 0 1 0 0 0 0 0 0 0
## 2 A v2 0 0 1 0 0 0 0 0 0 0 0 0
## 3 A v3 0 1 1 0 1 0 0 0 0 0 0 0
## 4 A v4 0 0 0 0 0 0 1 0 0 0 0 0
## 5 B v1 0 0 0 0 0 0 0 1 0 0 0 0
## 6 B v3 1 0 0 0 0 0 0 0 1 0 0 0
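The zeros appear because month is converted to a factor with all twelve levels before counting, so table() reports every level, including those never observed. A minimal illustration on a made-up vector (the values in m are hypothetical, not taken from the data above):

```r
# Hypothetical month numbers for a single station/variable group
m <- c(1, 1, 3)

# Fixing the levels to 1:12 keeps unobserved months in the result as zeros
counts <- table(factor(m, levels = 1:12, labels = month.abb))

# Counts by month as a plain integer vector: 2 for Jan, 1 for Mar, 0 elsewhere
as.integer(counts)
```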
Thanks to @Henrik for the month.abb trick.
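As a side note on the error in the question: length(month = 1) passes an argument named month to length(), whose only parameter is x, which would explain the message whether or not a month is missing. Counting matches needs a comparison with == inside sum(). The sketch below uses a small made-up data set (its values and the helper names jan_A, pair and month_f are illustrative assumptions) and shows a base R cross-tabulation as an alternative to plyr, not the answer's own method:

```r
# Small made-up data set in the question's layout (values are illustrative)
seasons <- data.frame(
  station  = c("A", "A", "B"),
  variable = c("v1", "v1", "v1"),
  month    = c(1L, 1L, 2L)
)

# Counting January rows for station A: compare with ==, then sum the TRUEs
jan_A <- sum(seasons$month[seasons$station == "A"] == 1)

# Base R alternative: cross-tabulate each station/variable pair
# against a month factor that carries all twelve levels
pair    <- interaction(seasons$station, seasons$variable, drop = TRUE)
month_f <- factor(seasons$month, levels = 1:12, labels = month.abb)
counts  <- table(pair, month_f)

counts["A.v1", "Jan"]  # observations for station A, variable v1 in January
counts["B.v1", "Jan"]  # never sampled in January, so the cell is zero
```

Rows of counts are labelled station.variable (for example A.v1), so the station and variable columns would need to be split back out if the ddply-style layout is required.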