Time series analysis of sensor data in R

Date: 2016-12-16 15:38:16

Tags: r machine-learning time-series sensor kalman-filter

I have a data frame of sensor data.

My data frame looks like this:

pressure    datetime
4.848374    2016-04-12 10:04:00   
4.683901    2016-04-12 10:04:32   
5.237860    2016-04-12 10:13:20 

Now I want to apply ARIMA for forecasting.

Because the data is not uniformly sampled, I aggregated it on an hourly basis, as shown below:

datetime                    pressure
"2016-04-19 00:00:00 BST"   5.581806
"2016-04-19 01:00:00 BST"   4.769832
"2016-04-19 02:00:00 BST"   4.769832  
"2016-04-19 03:00:00 BST"   4.553711  
"2016-04-19 04:00:00 BST"   6.285599  
"2016-04-19 05:00:00 BST"   5.873414

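The hourly pressure plotted over time looks like this:

[image: plot of hourly pressure]

But I cannot create a ts object, because I am not sure what the frequency should be for hourly data.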
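The hourly aggregation described in the question could be sketched roughly as follows in base R. The readings after the first three rows are made up for illustration, and the column name `hour`, the UTC time zone, and the use of the mean as the aggregate are assumptions, since the question does not say how the hourly values were computed.

```r
# Hypothetical irregular readings, shaped like the question's data frame
raw <- data.frame(
  pressure = c(4.848374, 4.683901, 5.237860, 5.0, 6.0),
  datetime = as.POSIXct(c("2016-04-12 10:04:00", "2016-04-12 10:04:32",
                          "2016-04-12 10:13:20", "2016-04-12 11:30:00",
                          "2016-04-12 11:45:00"), tz = "UTC")
)

# Truncate each timestamp to the start of its hour
raw$hour <- as.POSIXct(trunc(raw$datetime, units = "hours"))

# Average all readings that fall within the same hour (assumed aggregate)
hourly <- aggregate(pressure ~ hour, data = raw, FUN = mean)
print(hourly)
```

Hours with no readings are simply absent from `hourly`, so any gaps would still need to be filled before the result is treated as a regularly spaced series.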

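1 answer:

Answer 0 (score: 2)

Your question was already answered in the comments, but to reiterate: set the frequency to 24. For ts(), frequency is the number of observations per seasonal cycle, and with hourly data and a daily cycle that is 24: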

sensor = ts(hourlyPressure, frequency = 24)
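To see what frequency = 24 does, here is a small sketch on made-up values (four days of 24 hourly readings; the numbers themselves are hypothetical). It only inspects the resulting object with base R's frequency() and cycle().

```r
# Hypothetical hourly values: four days of 24 readings each
hourlyPressure <- rep(c(1:12, 12:1), times = 4)
sensor <- ts(hourlyPressure, frequency = 24)

# Number of observations per seasonal cycle (one day here)
print(frequency(sensor))

# Position of each observation within its day; it wraps back to 1 at observation 25
print(head(cycle(sensor), 26))
```

With this setting, one unit of the ts time index corresponds to one day, and each observation is one of 24 positions within that day.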

As for your next point about fixing the dates in the plot, let's start with some example data:

###Sequence of numbers to forecast    
hourlyPressure<-c(1:24, 12:36, 24:48, 36:60)
###Sequence of Accompanying Dates
dates<-seq(as.POSIXct("2016-04-19 00:00:00"), as.POSIXct("2016-04-23 02:00:00"), by="hour")

Now we can turn the hourlyPressure data into a ts() object (let's ignore the dates for a moment):

sensor <- ts(hourlyPressure, frequency=24)

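Now fit your ARIMA model. In this example I will use the auto.arima function from the forecast package, because finding the best ARIMA model is not the focus here (although auto.arima() is a fairly robust way to find a well-fitting ARIMA model for your data):

###load the forecast package, which provides auto.arima() and forecast()
library(forecast)
###fit arima model to sensor data
sensor_arima_fit<- auto.arima(sensor)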

Then you can plot this data against the appropriate dates simply by specifying the x values in the plot() function:

plot(y=sensor_arima_fit$x, x=dates)

It gets a little harder when we forecast and want to plot both the original data and the forecasts with the correct dates.

###now forecast ahead (lets say 2 days) using the arima model that was fit above
forecast_sensor <- forecast(sensor_arima_fit, h = 48)

Now, to plot the original data and the forecasts with the correct dates, we can do the following:

###set h to be the same as above
h <- c(48)
###calculate the dates for the forecasted values
forecasted_dates<-seq(dates[length(dates)]+(60*60)*(1), 
                  dates[length(dates)]+(60*60)*(h), by="hour")

###now plot the data
plot(y=c(forecast_sensor$x, forecast_sensor$mean), 
     x=seq(as.POSIXct("2016-04-19 00:00:00"),as.POSIXct(forecasted_dates[length(forecasted_dates)]), by="hour"),
     xaxt="n", 
     type="l", 
     main="Plot of Original Series and Forecasts", 
     xlab="Date", 
     ylab="Pressure")

###correctly formatted x axis
axis.POSIXct(1, at=seq(as.POSIXct("2016-04-19 00:00:00"), 
                       as.POSIXct(forecasted_dates[length(forecasted_dates)]), 
                       by="hour"), 
             format="%b %d", 
             tick = FALSE)

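This plots the original data together with the forecasts, with the correct dates. However, as in the plots produced by the forecast package, we may want the forecasts drawn in blue.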

###keep same plot as before
plot(y=c(forecast_sensor$x, forecast_sensor$mean), 
     x=seq(as.POSIXct("2016-04-19 00:00:00"),as.POSIXct(forecasted_dates[length(forecasted_dates)]), by="hour"),
     xaxt="n", 
     type="l", 
     main="Plot of Original Series and Forecasts", 
     xlab="Date", 
     ylab="Pressure")

axis.POSIXct(1, at=seq(as.POSIXct("2016-04-19 00:00:00"), 
                       as.POSIXct(forecasted_dates[length(forecasted_dates)]), 
                       by="hour"), 
             format="%b %d", 
             tick = FALSE)

###This time however, lets add a different color line for the forecasts
lines(y=forecast_sensor$mean, x= forecasted_dates, col="blue")
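If the forecasts are needed as values with timestamps rather than only on a plot, the same date arithmetic can be used to build a table. The sketch below is a variation, not the approach above: it uses base R's arima() and predict() in place of auto.arima() and forecast() so that it runs without extra packages. The model order c(1, 1, 0), the UTC time zone, and the name `forecast_table` are assumptions chosen only for illustration.

```r
# Same example data as in the answer, with an explicit time zone (assumed UTC)
hourlyPressure <- c(1:24, 12:36, 24:48, 36:60)
dates <- seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
             by = "hour", length.out = length(hourlyPressure))
sensor <- ts(hourlyPressure, frequency = 24)

# Assumption: a fixed ARIMA(1,1,0) order, not a tuned or recommended model
fit <- arima(sensor, order = c(1, 1, 0))

# Forecast 48 hours ahead
h <- 48
pred <- predict(fit, n.ahead = h)

# Timestamps for the horizon: the h hours following the last observation
forecasted_dates <- seq(dates[length(dates)] + 60 * 60,
                        by = "hour", length.out = h)

# Pair each forecast with its timestamp
forecast_table <- data.frame(datetime = forecasted_dates,
                             forecast = as.numeric(pred$pred))
print(head(forecast_table))
```

Using length.out = h instead of computing the end date keeps the number of timestamps tied directly to the forecast horizon.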