I have a database of time-series data, and I want to interpolate the data at specific time steps.
Id Time humid humtemp prtemp press t
1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778
I would like to obtain data interpolated at the following time steps.
Id Time humid humtemp prtemp press t
1 2012-01-21 18:45:00 .... ... ..... .... ....
1 2012-01-21 18:50:00 ....
1 2012-01-21 18:55:00 ....
1 2012-01-21 19:00:00 ....
1 2012-01-21 19:05:00 ....
1 2012-01-21 19:10:00 ....
I have tried several approaches but have not found a solution. For example, I created a zoo object.
z <- zoo(MTS01m,order.by=MTS01m$Time)
tstart2<-asP("2012-01-21 18:45:00")
Ts<-1*60
y <- merge(z, zoo(order.by=seq(tstart2, end(z), by=Ts)))
xa <- na.approx(y)
xs <- na.spline(y)
But an error occurs:
Error in approx(x[!na], y[!na], xout, ...) :
need at least two non-NA values to interpolate
In addition: Warning message:
In xy.coords(x, y) : NAs introduced by coercion
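I created a secondary index t that starts at the point where I want the data to begin, but I do not know how to use this index.
Do you have any suggestions?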
Answer 0 (score: 3)
Try this (assuming your time index is POSIXct):
library(zoo)
st <- as.POSIXct("2012-01-21 18:45")
g <- seq(st, end(z), by = "15 min") # grid
na.approx(z, xout = g)
See ?na.approx.zoo for details.
Note: since the question did not provide the data in a reproducible form, we do so here:
Lines <- "Id date Time humid humtemp prtemp press t1
1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778"
library(zoo)
z <- read.zoo(text = Lines, header = TRUE, index = 2:3, tz = "")
st <- as.POSIXct("2012-01-21 18:45")
g <- seq(st, end(z), by = "15 min") # grid
na.approx(z, xout = g)
which gives:
Id humid humtemp prtemp press t1
2012-01-21 18:45:00 1 45.62491 14.93058 14.66761 1005.376 -1.501706e-09
2012-01-21 19:00:00 1 42.28294 16.27130 15.94370 1004.828 2.500000e-01
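The question asks for 5-minute steps, whereas the grid above uses 15 minutes. The following is a minimal sketch of the same `xout` approach on a 5-minute grid, restricted to the humid column. The UTC time zone and the names `times`, `humid` and `zi` are illustrative assumptions, not part of the original answer.

```r
library(zoo)

# Observation times from the question; the UTC time zone is an assumption
times <- as.POSIXct(c("2012-01-21 18:41:50", "2012-01-21 18:46:43",
                      "2012-01-21 18:51:35", "2012-01-21 18:56:28",
                      "2012-01-21 19:01:21", "2012-01-21 19:06:14",
                      "2012-01-21 19:11:07"), tz = "UTC")
humid <- c(47.7, 44.5, 43.2, 42.5, 42.2, 41.8, 41.6)
z <- zoo(humid, order.by = times)

# Regular 5-minute grid starting at the first requested time step
st <- as.POSIXct("2012-01-21 18:45:00", tz = "UTC")
g <- seq(st, end(z), by = "5 min")

# Linear interpolation evaluated only at the grid points
zi <- na.approx(z, xout = g)
print(zi)
```

With a multi-column zoo object such as the `z` built by read.zoo above, the same call should interpolate each column.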
Answer 1 (score: 2)
You can follow this procedure:
Create the dataset:
data1 <- read.table(text="1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778",
col.names=c("Id","date","Time","humid","humtemp","prtemp","press","t1"))
# Combine the date and time columns into a single date-time value
data1$datetime <- strptime(paste(data1$date, data1$Time, sep=" "), "%Y-%m-%d %H:%M:%S")
Load the zoo library:
library(zoo)
Step 1:
# sequence interval 5 seconds
seq1 <- zoo(order.by=(as.POSIXlt( seq(min(data1$datetime), max(data1$datetime), by=5) )))
Step 2:
mer1 <- merge(zoo(x=data1[4:7],order.by=data1$datetime), seq1)
Step 3:
#Constant interpolation
dataC <- na.approx(mer1, method="constant")
#Linear interpolation
dataL <- na.approx(mer1)
Visualization:
head(dataC)
humid humtemp prtemp press
2012-01-21 18:41:50 47.7 14.12 13.87 1005.7
2012-01-21 18:41:55 47.7 14.12 13.87 1005.7
2012-01-21 18:42:00 47.7 14.12 13.87 1005.7
2012-01-21 18:42:05 47.7 14.12 13.87 1005.7
2012-01-21 18:42:10 47.7 14.12 13.87 1005.7
2012-01-21 18:42:15 47.7 14.12 13.87 1005.7
head(dataL)
humid humtemp prtemp press
2012-01-21 18:41:50 47.70000 14.12000 13.87000 1005.700
2012-01-21 18:41:55 47.64539 14.14133 13.89099 1005.691
2012-01-21 18:42:00 47.59078 14.16266 13.91198 1005.683
2012-01-21 18:42:05 47.53618 14.18399 13.93297 1005.674
2012-01-21 18:42:10 47.48157 14.20532 13.95396 1005.666
2012-01-21 18:42:15 47.42696 14.22666 13.97495 1005.657
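To make the difference between the two methods in Step 3 concrete, here is a small sketch using base R's `approx` on the first two humid readings. Times are expressed as seconds since the first observation, and the names `x`, `y`, `target`, `lin` and `const` are illustrative only.

```r
# First two observations: 18:41:50 and 18:46:43 are 293 seconds apart
x <- c(0, 293)
y <- c(47.7, 44.5)

# 18:45:00 is 190 seconds after the first observation
target <- 190

# Linear: weighted average of the two neighbouring readings
lin <- approx(x, y, xout = target)$y

# Constant: the last reading before the target is carried forward
const <- approx(x, y, xout = target, method = "constant")$y

print(c(linear = lin, constant = const))
```

The constant method keeps 47.7 until the next observation arrives, which is why head(dataC) shows the same values on every row, while the linear method moves gradually toward 44.5.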
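Answer 2 (score: 0)
I could not find a function in the xts package (or zoo) that approximates a series at given dates.
So my idea is to insert NAs into the original time series at the given dates.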
ids <- as.POSIXct( align.time(index(dat.xts),60*5)) # range dates
# I create an xts with NA
y <- xts(x=matrix(data=NA,nrow=dim(dat.xts)[1],
ncol=dim(dat.xts)[2]),
order.by=ids)
rbind(y,dat.xts)
humid humtemp prtemp press t
2012-01-21 18:41:50 47.7 14.12 13.870 1005.7 -0.05277778
2012-01-21 18:45:00 NA NA NA NA NA
2012-01-21 18:46:43 44.5 15.37 15.100 1005.2 0.02861111
2012-01-21 18:50:00 NA NA NA NA NA
2012-01-21 18:51:35 43.2 15.88 15.576 1005.1 0.10972222
2012-01-21 18:55:00 NA NA NA NA NA
Now you can use na.approx or na.spline like this:
na.approx(rbind(y,dat.xts))[index(y)]
humid humtemp prtemp press t
2012-01-21 18:45:00 45.62 14.93 14.67 1005.38 0.00
2012-01-21 18:50:00 43.62 15.71 15.42 1005.13 0.08
2012-01-21 18:55:00 42.71 16.08 15.76 1004.96 0.17
2012-01-21 19:00:00 42.28 16.27 15.94 1004.83 0.25
2012-01-21 19:05:00 41.90 16.43 16.08 1004.65 0.33
2012-01-21 19:10:00 41.65 16.50 16.16 1004.60 0.42
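The same idea (insert NAs at the target times, fill them, then keep only the target rows) can also be sketched with zoo alone, here using na.spline instead of na.approx. This is a sketch under assumptions: it uses only the humid column, a UTC time zone, and illustrative names (`m`, `filled`, `res`) that do not appear in the original answer.

```r
library(zoo)

# Observation times from the question; the UTC time zone is an assumption
times <- as.POSIXct(c("2012-01-21 18:41:50", "2012-01-21 18:46:43",
                      "2012-01-21 18:51:35", "2012-01-21 18:56:28",
                      "2012-01-21 19:01:21", "2012-01-21 19:06:14",
                      "2012-01-21 19:11:07"), tz = "UTC")
humid <- c(47.7, 44.5, 43.2, 42.5, 42.2, 41.8, 41.6)
z <- zoo(humid, order.by = times)

# Target 5-minute grid
g <- seq(as.POSIXct("2012-01-21 18:45:00", tz = "UTC"), end(z), by = "5 min")

# Merging with an empty zoo series adds the grid times as NA rows
m <- merge(z, zoo(, g))

# Fill the NA rows with a cubic spline through the observed values
filled <- na.spline(m)

# Keep only the rows that fall on the target grid
res <- filled[index(filled) %in% g]
print(res)
```

Replacing na.spline with na.approx in this sketch gives linear interpolation instead.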