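Dear users, I am looking into ways to resample my time series data. Basically, my problem is that I have to do calculations with different signals that have different timestamps. At first I thought of using approxfun() or the zoo package.
The problem is that I need to keep the NA values at t-1 and t+1 if they exist, and interpolate only when certain conditions are met. To extract the different timestamps of the signals, I did the following: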
GlobalTime<-sort(do.call(`c`, data[c(1, 3, 5)])) #extract the total timestamp of the measurements
Time1<-seq.POSIXt(from=min(data$X1, na.rm=TRUE), to=max(data$X1, na.rm=TRUE), by="s") #extract the timestamp of each measurement
Time2<-seq.POSIXt(from=min(data$X3, na.rm=TRUE), to=max(data$X3, na.rm=TRUE), by="s")
Time3<-seq.POSIXt(from=min(data$X5, na.rm=TRUE), to=max(data$X5, na.rm=TRUE), by="s")
#All vectors have different size
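For the interpolation I need to consider 3 points: the first comes from Time1, and the other two come from GlobalTime (the two that are closest to Time1[i]). I think this can be done like this: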
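upperlimits1 <- c()
lowerlimits1 <- c()
# i-1 and i+1 fall outside the vector at the first and last index, so only the interior is looped over
for (i in 2:(length(actualTime) - 1)) {
  upperlimits1[i] <- min(abs(actualTime[i+1] - requestTime1[i]))
  lowerlimits1[i] <- min(abs(actualTime[i-1] - requestTime1[i]))
}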
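To make clearer what I mean by "the two closest points", here is a minimal sketch of finding the neighbour just before and just after each requested time. It uses findInterval() from base R on toy numeric times; global_time and target_time are made-up names and values, not my real data, and POSIXct columns would presumably need as.numeric() first.

```r
# Hypothetical toy data: seconds since an arbitrary origin (not the real measurements).
global_time <- c(0, 4, 7, 8, 14, 15, 19, 21)   # sorted union of all timestamps
target_time <- c(1, 5, 16, 22)                 # times at which a value is wanted

# Index of the last global_time that is <= each target (0 if nothing precedes it).
lo_idx <- findInterval(target_time, global_time)
# Index of the first global_time strictly after each target.
hi_idx <- lo_idx + 1

# Out-of-range indices become NA, so a missing neighbour stays missing.
lo_idx[lo_idx == 0] <- NA
hi_idx[hi_idx > length(global_time)] <- NA

lower_limits <- global_time[lo_idx]
upper_limits <- global_time[hi_idx]
```

Note that when a target coincides exactly with a global timestamp, lower_limits returns that same timestamp, which is the "exact match" case.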
Later, to do the interpolation, I would do something like this:
for (i in 1:length(actualTime)) {
  # the for loop advances i by itself, so no manual increment is needed
  if (lowerlimits1[i] == Time1[i]) {
    resample[i,1] <- value[Time[i-1]]
    resample[i,2] <- Time1[i-1]
    resample[i,3] <- "OK"
  } else if (upperlimits1[i] == Time1[i]) {
    resample[i,1] <- value[Time[i+1]]
    resample[i,2] <- Time1[i+1]
    resample[i,3] <- "OK"
  } else if (upperlimits1[i] != Time1[i] && lowerlimits1[i] != Time1[i] && aux[i-1] == "Wrong") {
    resample[i,1] <- value[Time1[i-1]]
    resample[i,2] <- Time1[i-1]
    resample[i,3] <- "Wrong"
  } else if (upperlimits1[i] != Time1[i] && lowerlimits1[i] != Time1[i] && aux[i+1] == "Wrong") {
    resample[i,1] <- value[Time1[i+1]]
    resample[i,2] <- Time1[i+1]
    resample[i,3] <- "Wrong"
  } else if (aux[i] == "C") {
    resample[i,1] <- value[Time1[i-1]]
    resample[i,2] <- Time1[i-1]
    resample[i,3] <- aux[i-1]
  } else {
    Delta <- upperlimits1[i] - lowerlimits1[i]
    if (Delta > 3600) {
      resample[i,3] <- "Wrong"
    } else if (Delta < 3600) {
      resample[i,3] <- "OK"
      # linear interpolation between the lower and upper neighbour
      coeff <- (Time1[i] - lowerlimits1[i]) / (upperlimits1[i] - lowerlimits1[i])
      resample[i,1] <- value[lowerlimits1[i]] + (value[upperlimits1[i]] - value[lowerlimits1[i]]) * coeff
    }
  }
}
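The rule in the last branch (interpolate linearly only when the two neighbours are less than 3600 s apart, otherwise mark the point as "Wrong") could perhaps be written in vectorised form. The following is only a sketch under assumptions: interp_with_gap and its arguments are hypothetical names, times are plain numeric seconds, "Wrong" readings have already been turned into NA, and a gap of exactly 3600 s is treated as "Wrong" since the loop above leaves that case undefined.

```r
# Linearly interpolate y at the times in xout, but only where the two
# bracketing valid samples are less than max_gap seconds apart.
# Hypothetical helper, not part of any package.
interp_with_gap <- function(x, y, xout, max_gap = 3600) {
  ok <- !is.na(x) & !is.na(y)            # drop missing samples
  x <- x[ok]; y <- y[ok]
  ord <- order(x); x <- x[ord]; y <- y[ord]

  lo <- findInterval(xout, x)            # last sample at or before xout
  hi <- lo + 1L                          # first sample after xout
  lo[lo == 0L] <- NA
  hi[hi > length(x)] <- NA

  gap   <- x[hi] - x[lo]
  coeff <- (xout - x[lo]) / gap
  value <- y[lo] + (y[hi] - y[lo]) * coeff
  value[is.na(gap) | gap >= max_gap] <- NA   # gap too long or neighbour missing

  exact <- !is.na(lo) & x[lo] == xout        # requested time is a sample time
  value[exact] <- y[lo[exact]]

  data.frame(time = xout, value = value,
             quality = ifelse(is.na(value), "Wrong", "OK"))
}

# Toy usage with made-up numbers
res <- interp_with_gap(x = c(0, 10, 5000, 5010), y = c(1, 3, 10, 20),
                       xout = c(5, 10, 2000, 5005, 6000))
```

In this toy call the points at 2000 and 6000 come back as NA with quality "Wrong", because the first lies in a gap of 4990 s and the second has no later neighbour.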
Edit: as requested, I have added what the data looks like:
Time1 Value1 Quality1 Time2 Value2 Quality2...
00:00.9 41.3 Ok 00:04.0 78.2 Ok
00:01.9 41.68 Ok 00:07.0 78.5 Ok
00:04.9 35.34 Ok 00:08.0 66 Ok
00:15.9 35.98 Ok 00:14.0 65.8 Ok
00:21.9 Wrong Wrong 00:15.0 64.5 Ok
00:22.9 38.9 Ok 00:19.0 40.5 Ok
59:56.5 40.1 Ok 00:21.0 30.5 Ok
#of course sometimes wrong values can be recorded for longer periods as hours or days
#at the end the interpolation has the purpose of having the three or more signals with same timestamp
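As an illustration of what "same timestamp" means here, this is a small sketch that puts two signals on a shared hourly grid with approx() from the stats package. The signals sig1 and sig2, the origin, and the helper resample_to are made up for the example. Plain approx() interpolates across gaps of any length, so the quality rule would still have to be applied separately.

```r
# Hypothetical toy signals with POSIXct timestamps in UTC (not the real data).
origin <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
sig1 <- data.frame(time  = origin + c(0, 1800, 5400, 9000),
                   value = c(10, 20, 40, 60))
sig2 <- data.frame(time  = origin + c(600, 4200, 7800),
                   value = c(1, 2, 3))

# Shared hourly grid: 01:00:00 and 02:00:00
grid <- seq(from = origin + 3600, to = origin + 7200, by = "hour")

# Linear interpolation of one signal onto the grid.
# rule = 1 returns NA for grid points outside the signal's time range.
resample_to <- function(sig, grid) {
  approx(x = as.numeric(sig$time), y = sig$value,
         xout = as.numeric(grid), rule = 1)$y
}

aligned <- data.frame(time   = grid,
                      value1 = resample_to(sig1, grid),
                      value2 = resample_to(sig2, grid))
```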
Desired output:
Time1 Value1 Quality1 Time2 Value2 Quality2...
01:00:00 IntValue Ok 01:00:00 IntValue Ok
02:00:00 IntValue Ok 02:00:00 IntValue Ok
03:00:00 IntValue Ok 03:00:00 IntValue Ok
04:00:00 IntValue Ok 04:00:00 IntValue Ok
05:00:00 IntValue Ok 05:00:00 IntValue Ok
06:00:00 IntValue Ok 06:00:00 IntValue Ok
07:00:00 IntValue Ok 07:00:00 IntValue Ok
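However, if the quality column is Wrong for more than 24 hours, the interpolation should write
As you can imagine, though, the code does not work. Has anyone tried something like this?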
BR