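I want to compute daily averages of hourly electricity prices from the NordPool market. I am using the aggregate() method from the timeSeries package to build daily means of the hourly data, which I have converted to a timeSeries object. Here is the dput() of the first 72 hours: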
> dput(tstSeries)
new("timeSeries"
, .Data = structure(c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79,
28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65,
34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15,
32.66, 31.83, 31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33,
38.42, 38.25, 37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32,
38.49, 37.46, 36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67,
32.05, 33.67, 34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51,
36.4, 36.42, 36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45,
34.77, 32.09), .Dim = c(72L, 1L), .Dimnames = list(NULL, "TS.1"))
, units = "TS.1"
, positions = c(1356998400, 1357002000, 1357005600, 1357009200, 1357012800,
1357016400, 1357020000, 1357023600, 1357027200, 1357030800, 1357034400,
1357038000, 1357041600, 1357045200, 1357048800, 1357052400, 1357056000,
1357059600, 1357063200, 1357066800, 1357070400, 1357074000, 1357077600,
1357081200, 1357084800, 1357088400, 1357092000, 1357095600, 1357099200,
1357102800, 1357106400, 1357110000, 1357113600, 1357117200, 1357120800,
1357124400, 1357128000, 1357131600, 1357135200, 1357138800, 1357142400,
1357146000, 1357149600, 1357153200, 1357156800, 1357160400, 1357164000,
1357167600, 1357171200, 1357174800, 1357178400, 1357182000, 1357185600,
1357189200, 1357192800, 1357196400, 1357200000, 1357203600, 1357207200,
1357210800, 1357214400, 1357218000, 1357221600, 1357225200, 1357228800,
1357232400, 1357236000, 1357239600, 1357243200, 1357246800, 1357250400,
1357254000)
, format = "%Y-%m-%d %H:%M:%S"
, FinCenter = "GMT"
, recordIDs = structure(list(), .Names = character(0), row.names = integer(0), class = "data.frame")
, title = "Time Series Object"
, documentation = "Wed May 20 11:02:09 2015"
)
To do the averaging, I run the following:
## daily averaging
bydaily = timeSequence(from = start(tstSeries), to = end(tstSeries), by = "day")
tstSeries.daily = aggregate(tstSeries, by = bydaily, FUN = mean)
The output I get is:
> tstSeries.daily
GMT
TS.1
2013-01-01 31.05000
2013-01-02 31.82167
2013-01-03 36.67375
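Here, the first daily average is just the original first data point! I did the same calculation in Excel and confirmed that the first data point is not included in the averaging. Instead, the value reported for 2013-01-02 is the mean of the observations from 2013-01-01 01:00 through 2013-01-02 00:00.
I have looked at several examples that show how to use aggregate(), but none of them ran into this problem. Has anyone seen this happen, and is there a workaround?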
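A quick check of that arithmetic in base R, using only the first 25 hourly values from the dput() above (the variable names `first25`, `lone_point` and `shifted_day` are just for illustration):

```r
# First 25 hourly prices: 2013-01-01 00:00 through 2013-01-02 00:00 (GMT),
# copied from the dput() output above
first25 <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79,
             28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65,
             34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15)

# The 00:00 observation on its own: matches the first row of the aggregate() output
lone_point <- mean(first25[1])

# 2013-01-01 01:00 through 2013-01-02 00:00: matches the second row
shifted_day <- mean(first25[2:25])

print(round(c(lone_point, shifted_day), 5))
# [1] 31.05000 31.82167
```

These numbers are consistent with each breakpoint in `by` closing its interval on the right, so the 2013-01-01 00:00 breakpoint collects only the 00:00 observation and every later bucket is shifted by one hour.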
Answer 0 (score: 0)
Here is a solution that returns the desired output. It relies on the apply.rolling function from the PerformanceAnalytics package.
library(PerformanceAnalytics)
tstSeries.daily <- apply.rolling(tstSeries, width = 24, by = 24, FUN = "mean") # mean of each consecutive 24-hour window
tstSeries.daily <- tstSeries.daily[complete.cases(tstSeries.daily), ] # drop the rows that are NA
rownames(tstSeries.daily) <- as.Date(rownames(tstSeries.daily)) # drop the time part of the index
print(tstSeries.daily)
GMT
calcs
2013-01-01 31.73417
2013-01-02 36.67542
2013-01-03 35.09958
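As a cross-check that needs no extra package, here is a minimal base-R sketch that groups the same 72 values by calendar date. It assumes the observations are hourly and start at 2013-01-01 00:00 GMT, as the positions in the dput() indicate. The names `prices`, `times`, `day` and `daily_mean` are illustrative and not part of the original code.

```r
# Hourly prices copied from the dput() in the question (72 values = 3 days)
prices <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79,
            28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65,
            34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15,
            32.66, 31.83, 31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33,
            38.42, 38.25, 37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32,
            38.49, 37.46, 36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67,
            32.05, 33.67, 34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51,
            36.4, 36.42, 36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45,
            34.77, 32.09)

# Rebuild the hourly GMT timestamps; 1356998400 is the first entry of `positions`
times <- as.POSIXct(1356998400, origin = "1970-01-01", tz = "GMT") +
  3600 * (seq_along(prices) - 1)

# Group by calendar date, so each day covers 00:00 through 23:00
day <- format(times, "%Y-%m-%d", tz = "GMT")
daily_mean <- tapply(prices, day, mean)

print(round(daily_mean, 5))
```

Rounded to five decimals, the three daily means come out as 31.73417, 36.67542 and 35.09958, the same values as the apply.rolling output above. Grouping on the date string avoids the breakpoint question entirely, because every observation is assigned to the day it falls on.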