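I want to create a ts() object from a data frame in order to forecast a physical phenomenon.
My data cover one year (2018-01-01 to 2018-12-31) at 30-minute intervals, and I have observed that they have a daily (1-day) seasonality.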
> head(pleiadesGH.v2[,c("time", "humExt.R", "tempExt", "radExt", "vientoVelo")])
time humExt.R tempExt radExt vientoVelo
1 2018-01-01 00:00:00 NA NA NA NA
2 2018-01-01 00:30:00 36.78287 16.95125 -10.08125 3.68550
3 2018-01-01 01:00:00 38.56775 16.26350 -9.75000 2.38420
4 2018-01-01 01:30:00 38.76425 15.63470 -10.08125 2.71915
5 2018-01-01 02:00:00 39.61575 15.32030 -10.41250 3.70475
6 2018-01-01 02:30:00 37.48700 15.06485 -10.74375 2.51895
Based on these answers:
https://robjhyndman.com/hyndsight/seasonal-periods/
time series with 10 min frequency in R
I concluded that the frequency for ts() should be 48, because there are 48 observations per day.
ts.freq1 <- ts(data = pleiadesGH.v2[,2:ncol(pleiadesGH.v2)],
start = c(2018),
frequency = 48)
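However, the resulting ts() has the wrong time index, as shown below. The time index should span 2018 to 2019 rather than running out toward the year 2400.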
Time Series:
Start = c(2018, 1)
End = c(2383, 1)
Frequency = 48
humInt.R humInt.E tempInt tempMac humExt.R humExt.E radExt tempExt vientoVelo
2018.000 NA NA NA NA NA NA NA NA NA
2018.021 NA NA NA NA 36.78287 0.004410894 -10.08125 16.95125 3.6855000
2018.042 NA NA NA NA 38.56775 0.004427114 -9.75000 16.26350 2.3842000
2018.062 NA NA NA NA 38.76425 0.004273306 -10.08125 15.63470 2.7191500
2018.083 NA NA NA NA 39.61575 0.004280005 -10.41250 15.32030 3.7047500
2018.104 NA NA NA NA 37.48700 0.003982139 -10.74375 15.06485 2.5189500
2018.125 NA NA NA NA 35.84950 0.003735063 -10.41250 14.77010 3.2235000
2018.146 NA NA NA NA 36.68462 0.003697674 -8.75625 14.25920 1.4409500
2018.167 NA NA NA NA 41.48250 0.003954404 -11.07500 13.39460 1.5064000
2018.188 NA NA NA NA 42.54688 0.003968433 -9.41875 13.06055 3.6701000
2018.208 NA NA NA NA 43.05450 0.003969581 -9.08750 12.88370 1.6103500
2018.229 NA NA NA NA 44.11888 0.004000366 -9.41875 12.62825 1.3485500
2018.250 NA NA NA NA 46.26400 0.004061953 -9.08750 12.13700 1.9491500
2018.271 NA NA NA NA 46.88625 0.004084874 -9.08750 12.01910 2.0569500
2018.292 NA NA NA NA 49.57175 0.004187059
I also tried the following frequency:
ts.freq1 <- ts(data = pleiadesGH.v2[,2:ncol(pleiadesGH.v2)],
start = c(2018),
frequency = 365.25*24*60/30 )
which gives the following result:
Time Series:
Start = c(2018, 1)
End = c(2018, 17521)
Frequency = 17532
humInt.R humInt.E tempInt tempMac humExt.R humExt.E radExt tempExt vientoVelo
2018.000 NA NA NA NA NA NA NA NA NA
2018.000 NA NA NA NA 36.78287 0.004410894 -10.08125 16.95125 3.6855000
2018.000 NA NA NA NA 38.56775 0.004427114 -9.75000 16.26350 2.3842000
2018.000 NA NA NA NA 38.76425 0.004273306 -10.08125 15.63470 2.7191500
2018.000 NA NA NA NA 39.61575 0.004280005 -10.41250 15.32030 3.7047500
2018.000 NA NA NA NA 37.48700 0.003982139 -10.74375 15.06485 2.5189500
2018.000 NA NA NA NA 35.84950 0.003735063 -10.41250 14.77010 3.2235000
2018.000 NA NA NA NA 36.68462 0.003697674 -8.75625 14.25920 1.4409500
2018.000 NA NA NA NA 41.48250 0.003954404 -11.07500 13.39460 1.5064000
2018.001 NA NA NA NA 42.54688 0.003968433 -9.41875 13.06055 3.6701000
2018.001 NA NA NA NA 43.05450 0.003969581 -9.08750 12.88370 1.6103500
2018.001 NA NA NA NA 44.11888 0.004000366 -9.41875 12.62825 1.3485500
2018.001 NA NA NA NA 46.26400 0.004061953 -9.08750 12.13700 1.9491500
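But this implies that my seasonality is yearly, which is not what I want. In the figure below you can see that the time index is now correct even though the seasonality is wrong.
[Figure: good index, incorrect seasonality]
What am I doing wrong?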
Answer 0 (score: 0)
The solution is as follows:
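freq.daily <- 48 # 24 hours * 2 observations per hour
# pleiadesGH.v2.interp is presumably pleiadesGH.v2 with the NA values interpolated
ts.daily <- ts(data = pleiadesGH.v2.interp[, 2:ncol(pleiadesGH.v2.interp)],
start = 1, # count time in days, starting at day 1
frequency = freq.daily)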
Time Series:
Start = c(1, 1)
End = c(366, 1)
Frequency = 48
humInt.R humInt.E tempInt tempMac humExt.R humExt.E radExt
1.000000 74.56250 0.007699896 14.53500 13.625000 36.78287 0.004410894 -10.081250
1.020833 74.56250 0.007699896 14.53500 13.625000 36.78287 0.004410894 -10.081250
1.041667 74.56250 0.007699896 14.53500 13.625000 38.56775 0.004427114 -9.750000
1.062500 74.56250 0.007699896 14.53500 13.625000 38.76425 0.004273306 -10.081250
1.083333 74.56250 0.007699896 14.53500 13.625000 39.61575 0.004280005 -10.412500
1.104167 74.56250 0.007699896 14.53500 13.625000 37.48700 0.003982139 -10.743750
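To see where End = c(366, 1) comes from, it may help to reproduce the arithmetic on a synthetic series. The sketch below assumes 17521 half-hourly observations (the count implied by End = c(2018, 17521) in the question) and uses placeholder zeros instead of the real data. ts() measures time in units of one seasonal period, so with frequency = 48 one unit of the index is a day, not a year.

```r
n_obs <- 17521  # assumed: implied by End = c(2018, 17521) in the question

# Placeholder values; only the time index matters here.
x <- ts(rep(0, n_obs), start = 1, frequency = 48)

# The series spans (n_obs - 1) / 48 seasonal periods, i.e. 365 days.
span_days <- (n_obs - 1) / frequency(x)

# tsp() returns c(start time, end time, frequency).
tsp(x)

# With start = 2018 the same 365 periods are added to 2018,
# which is why the question's first attempt ended at 2383.
y <- ts(rep(0, n_obs), start = 2018, frequency = 48)
end(y)
```

The same reasoning explains the second attempt in the question: with frequency = 17532 one unit of the index is a year, so the index looks right but the seasonal period becomes yearly.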
Starting the index at 1 is a simple and effective way to handle the dates.
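Because the index is then a day counter rather than a calendar date, the real timestamps have to be kept or rebuilt separately. A minimal sketch of rebuilding them, assuming the first observation is at 2018-01-01 00:00:00 and that the time zone is UTC (the time zone is an assumption, and `origin` and `stamps` are hypothetical names used only for illustration):

```r
# Two days of placeholder values at 48 observations per day.
x <- ts(rep(0, 96), start = 1, frequency = 48)

# Assumed origin of the series; adjust tz to match the real data.
origin <- as.POSIXct("2018-01-01 00:00:00", tz = "UTC")

# time(x) is in days starting at 1, so subtract 1 and convert days to seconds.
# round() removes floating-point noise from fractions such as 1/48.
stamps <- origin + round((as.numeric(time(x)) - 1) * 86400)

head(stamps, 3)
```

The resulting vector can be used as the x axis when plotting forecasts, so the model sees a daily period while the plots still show calendar time.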