I'm new to R and can't find a good way to solve this. I have a time series in the following format:
>dates temperature depth salinity
>12/03/2012 11:26 9.7533 0.48073 37.607
>12/03/2012 11:56 9.6673 0.33281 37.662
>12/03/2012 12:26 9.6673 0.33281 37.672
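My variables are measured at irregular intervals, every 15 or every 30 minutes depending on the period. I want to compute yearly, monthly, and daily means for each variable, regardless of how many observations fall within a given day/month/year. I've read a lot about packages such as zoo, timeSeries, and xts, but I can't work out which of them fits my case (probably because I'm not skilled enough in R...).
I hope my post is clear; if not, please don't hesitate to tell me.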
Answer 0 (score: 6)
Convert your data to an xts object, then use apply.daily and its sibling functions to compute whatever values you want.
library(xts)
d <- structure(list(dates = c("12/03/2012 11:26", "12/03/2012 11:56",
"12/03/2012 12:26"), temperature = c(9.7533, 9.6673, 9.6673),
depth = c(0.48073, 0.33281, 0.33281), salinity = c(37.607,
37.662, 37.672)), .Names = c("dates", "temperature", "depth",
"salinity"), row.names = c(NA, -3L), class = "data.frame")
x <- xts(d[,-1], as.POSIXct(d[,1], format="%m/%d/%Y %H:%M"))
apply.daily(x, colMeans)
# temperature depth salinity
# 2012-12-03 12:26:00 9.695967 0.3821167 37.647
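The question also asks for monthly and yearly means. The xts package provides apply.monthly and apply.yearly alongside apply.daily, so the same pattern extends directly. The sketch below reuses the three sample rows from the question; the explicit tz = "UTC" is an assumption added here so that day boundaries do not depend on the local timezone.

```r
library(xts)

# The three sample rows from the question.
d <- data.frame(
  dates       = c("12/03/2012 11:26", "12/03/2012 11:56", "12/03/2012 12:26"),
  temperature = c(9.7533, 9.6673, 9.6673),
  depth       = c(0.48073, 0.33281, 0.33281),
  salinity    = c(37.607, 37.662, 37.672)
)

# Assumption: timestamps are treated as UTC to keep period boundaries stable.
idx <- as.POSIXct(d$dates, format = "%m/%d/%Y %H:%M", tz = "UTC")
x   <- xts(as.matrix(d[, -1]), order.by = idx)

# One row per calendar month / calendar year, column-wise means.
monthly <- apply.monthly(x, colMeans)
yearly  <- apply.yearly(x, colMeans)
```

With only one day of sample data, both results contain a single row; on a full data set each period gets its own row, stamped with the last observation time in that period.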
Answer 1 (score: 3)
I would add day, month, and year columns to the data frame and then use aggregate().
First convert your dates column to a POSIXct object:
d$timestamp <- as.POSIXct(d$dates,format = "%m/%d/%Y %H:%M",tz ="GMT")
Then put the date (e.g. 12/03/2012) into a column named Date:
d$Date <- format(d$timestamp,"%y-%m-%d",tz = "GMT")
Next, aggregate by date:
aggregate(cbind("temperature.mean" = temperature,
"salinity.mean" = salinity) ~ Date,
data = d,
FUN = mean)
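As a self-contained variant of the daily step, the sketch below also averages depth (which the question lists as a variable) and uses "%Y-%m-%d" so the key carries a four-digit year. The column choice and the format string are adjustments made here for illustration, not part of the original answer.

```r
# The three sample rows from the question.
d <- data.frame(
  dates       = c("12/03/2012 11:26", "12/03/2012 11:56", "12/03/2012 12:26"),
  temperature = c(9.7533, 9.6673, 9.6673),
  depth       = c(0.48073, 0.33281, 0.33281),
  salinity    = c(37.607, 37.662, 37.672)
)

d$timestamp <- as.POSIXct(d$dates, format = "%m/%d/%Y %H:%M", tz = "GMT")

# Four-digit year keeps the key unambiguous, e.g. "2012-12-03".
d$Date <- format(d$timestamp, "%Y-%m-%d", tz = "GMT")

# One row per day, one mean per variable.
daily <- aggregate(cbind(temperature, depth, salinity) ~ Date,
                   data = d, FUN = mean)
```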
Similarly, you can put the month into a column (let's call it M, for month) and then aggregate on it:
d$M <- format(d$timestamp,"%B",tz = "GMT")
aggregate(cbind("temperature.mean" = temperature,
"salinity.mean" = salinity) ~ M,
data = d,
FUN = mean)
Or, if you want year and month together:
d$YM <- format(d$timestamp,"%y-%B",tz = "GMT")
aggregate(cbind("temperature.mean" = temperature,
"salinity.mean" = salinity) ~ YM,
data = d,
FUN = mean)
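One thing to keep in mind: "%B" produces the full month name in the current locale, and aggregate() returns its groups sorted by key, so name-based keys come out alphabetically (December before February). A numeric key such as "%Y-%m" sorts chronologically. The sketch below shows this; the two February rows are invented purely for illustration and are not from the question.

```r
d <- data.frame(
  dates = c("02/10/2012 08:00", "02/10/2012 08:30",   # invented rows
            "12/03/2012 11:26", "12/03/2012 11:56", "12/03/2012 12:26"),
  temperature = c(8.0, 8.2, 9.7533, 9.6673, 9.6673),  # first two values invented
  salinity    = c(37.5, 37.7, 37.607, 37.662, 37.672) # first two values invented
)

d$timestamp <- as.POSIXct(d$dates, format = "%m/%d/%Y %H:%M", tz = "GMT")

# Numeric year-month key: locale-independent and chronologically sortable.
d$YM <- format(d$timestamp, "%Y-%m", tz = "GMT")

monthly <- aggregate(cbind(temperature, salinity) ~ YM,
                     data = d, FUN = mean)
```

Here monthly has the rows "2012-02" and "2012-12", in that order.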
If your data contains any NA values, you may want to use the following instead:
aggregate(cbind("temperature.mean" = temperature,
"salinity.mean" = salinity) ~ YM,
data = d,
function(x) mean(x,na.rm = TRUE))
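A subtlety worth checking on your own data: the formula method of aggregate() defaults to na.action = na.omit, which drops every row containing an NA in any of the listed variables before FUN is called. In that case na.rm = TRUE has nothing left to remove, and the non-missing values in those rows are lost as well. Passing na.action = na.pass keeps the rows so that each column skips only its own NAs. The sketch below illustrates the difference; the NA is a made-up modification of the sample data.

```r
# Sample rows from the question, with one salinity value replaced by NA
# (the NA is invented for illustration).
d <- data.frame(
  dates       = c("12/03/2012 11:26", "12/03/2012 11:56", "12/03/2012 12:26"),
  temperature = c(9.7533, 9.6673, 9.6673),
  salinity    = c(37.607, NA, 37.672)
)
d$timestamp <- as.POSIXct(d$dates, format = "%m/%d/%Y %H:%M", tz = "GMT")
d$YM <- format(d$timestamp, "%Y-%m", tz = "GMT")

mean_na <- function(x) mean(x, na.rm = TRUE)

# Default na.action = na.omit: the whole 11:56 row is dropped first,
# so the temperature mean is based on two rows only.
dropped <- aggregate(cbind(temperature, salinity) ~ YM,
                     data = d, FUN = mean_na)

# na.pass keeps the row; each column then ignores only its own NAs.
kept <- aggregate(cbind(temperature, salinity) ~ YM,
                  data = d, FUN = mean_na, na.action = na.pass)
```

In dropped, the temperature mean covers the 11:26 and 12:26 readings only; in kept, it covers all three readings while salinity still averages its two available values.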
Finally, if you want weekly averages, you can do that too. First generate the week number, then use aggregate() again:
d$W <- format(d$timestamp,"%W",tz = "GMT")
aggregate(cbind("temperature.mean" = temperature,
"salinity.mean" = salinity) ~ W,
data = d,
function(x) mean(x,na.rm = TRUE))
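This week numbering defines week 1 as the week that begins on the first Monday of the year; any days before that Monday fall in week 00. Weeks run from Monday to Sunday.
Answer 2 (score: 1)
Yet another approach, using plyr: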
df <- structure(list(dates = c("12/03/2012 11:26", "12/03/2012 11:56",
"12/03/2012 12:26"), temperature = c(9.7533, 9.6673, 9.6673),
depth = c(0.48073, 0.33281, 0.33281), salinity = c(37.607,
37.662, 37.672)), .Names = c("dates", "temperature", "depth",
"salinity"), row.names = c(NA, -3L), class = "data.frame")
library(plyr)
# Change date to POSIXct
df$dates <- with(df,as.POSIXct(dates,format="%m/%d/%Y %H:%M"))
# Make new variables, year and month
df <- transform(df,month=as.numeric(format(dates,"%m")),year=as.numeric(format(dates,"%Y")))
## According to year
ddply(df,.(year),summarize,meantemp=mean(temperature),meandepth=mean(depth),meansalinity=mean(salinity))
#   year meantemp meandepth meansalinity
# 1 2012 9.695967 0.3821167       37.647
## According to month
ddply(df,.(month),summarize,meantemp=mean(temperature),meandepth=mean(depth),meansalinity=mean(salinity))
#   month meantemp meandepth meansalinity
# 1    12 9.695967 0.3821167       37.647
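Grouping by month alone pools the same calendar month across different years (every December ends up in one row). If the series spans several years and per-year monthly means are wanted, ddply accepts more than one grouping variable. A minimal sketch, assuming the same sample rows; the tz = "GMT" argument is an addition made here for reproducibility.

```r
library(plyr)

# The three sample rows from the question.
df <- data.frame(
  dates       = c("12/03/2012 11:26", "12/03/2012 11:56", "12/03/2012 12:26"),
  temperature = c(9.7533, 9.6673, 9.6673),
  depth       = c(0.48073, 0.33281, 0.33281),
  salinity    = c(37.607, 37.662, 37.672)
)

df$dates <- as.POSIXct(df$dates, format = "%m/%d/%Y %H:%M", tz = "GMT")
df <- transform(df,
                month = as.numeric(format(dates, "%m")),
                year  = as.numeric(format(dates, "%Y")))

# One row per (year, month) combination instead of one row per month.
by_year_month <- ddply(df, .(year, month), summarize,
                       meantemp     = mean(temperature),
                       meandepth    = mean(depth),
                       meansalinity = mean(salinity))
```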
Answer 3 (score: 1)
The hydroTSM package contains several functions for creating annual and other summaries:
daily2annual(x, ...)
subdaily2annual(x, ...)
monthly2annual(x, ...)
annualfunction(x, FUN, na.rm = TRUE, ...)