Interpolating 15-minute values to 1-minute values

Time: 2018-01-30 10:19:42

Tags: r time-series interpolation zoo

I have a data frame that looks like this:

dat <- data.frame(time = seq(as.POSIXct("2010-01-01"),
                             as.POSIXct("2016-12-31") + 60*99, 
                             by = 60*15),
                  radiation = sample(1:500, 245383, replace = TRUE))

So I have one measurement every 15 minutes. The structure is:

> str(dat)
'data.frame':   245383 obs. of  2 variables:
 $ time     : POSIXct, format: "2010-01-01 00:00:00" "2010-01-01 00:15:00" "2010-01-01 00:30:00" "2010-01-01 00:45:00" ...
 $ radiation: num  230 443 282 314 286 225 77 89 97 330 ...

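Now I would like to interpolate, so that I end up with a data frame that has a value for every minute. I have searched several times and tried some of the methods in the zoo package, but I ran into problems with the data frame. Do I have to convert it to a text file first? I don't know how to do that.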

3 Answers:

Answer 0: (score: 0)

Here is a tidyverse solution.

library('tidyverse')

dat <- data.frame(time = seq(as.POSIXct("2010-01-01"),
                             as.POSIXct("2016-12-31") + 60*99, 
                             by = 60*15),
                  radiation = sample(1:500, 245383, replace = TRUE))

dat <- head(dat, 3)
dat
#                  time radiation
# 1 2010-01-01 00:00:00       241
# 2 2010-01-01 00:15:00       438
# 3 2010-01-01 00:30:00       457

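You can create a data frame that contains every required time value. Joining it to the data with full_join leaves radiation as NA for the time stamps that were not measured.

approx then fills those NA values by linear interpolation.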

dat %>%
  full_join(data.frame(time = seq(
    from = min(.$time),
    to = max(.$time),
    by = 'min'))) %>%
  arrange(time) %>%
  mutate(radiation = approx(radiation, n = n())$y)
# Joining, by = "time"
#                   time radiation
# 1  2010-01-01 00:00:00  241.0000
# 2  2010-01-01 00:01:00  254.1333
# 3  2010-01-01 00:02:00  267.2667
# 4  2010-01-01 00:03:00  280.4000
# 5  2010-01-01 00:04:00  293.5333
# 6  2010-01-01 00:05:00  306.6667
# 7  2010-01-01 00:06:00  319.8000
# 8  2010-01-01 00:07:00  332.9333
# 9  2010-01-01 00:08:00  346.0667
# 10 2010-01-01 00:09:00  359.2000
# 11 2010-01-01 00:10:00  372.3333
# 12 2010-01-01 00:11:00  385.4667
# 13 2010-01-01 00:12:00  398.6000
# 14 2010-01-01 00:13:00  411.7333
# 15 2010-01-01 00:14:00  424.8667
# 16 2010-01-01 00:15:00  438.0000
# 17 2010-01-01 00:16:00  439.2667
# 18 2010-01-01 00:17:00  440.5333
# 19 2010-01-01 00:18:00  441.8000
# 20 2010-01-01 00:19:00  443.0667
# 21 2010-01-01 00:20:00  444.3333
# 22 2010-01-01 00:21:00  445.6000
# 23 2010-01-01 00:22:00  446.8667
# 24 2010-01-01 00:23:00  448.1333
# 25 2010-01-01 00:24:00  449.4000
# 26 2010-01-01 00:25:00  450.6667
# 27 2010-01-01 00:26:00  451.9333
# 28 2010-01-01 00:27:00  453.2000
# 29 2010-01-01 00:28:00  454.4667
# 30 2010-01-01 00:29:00  455.7333
# 31 2010-01-01 00:30:00  457.0000
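
The approx(radiation, n = n()) call above interpolates over row positions, so it assumes the joined rows are evenly spaced in time and that the first and last values are not NA. A minimal base R sketch of the same idea that passes the time stamps to approx explicitly is shown below. The three readings are copied from the output above, and the names grid, full and known, as well as the UTC time zone, are only illustrative assumptions.

```r
# Three 15-minute readings (values taken from the example output above)
dat <- data.frame(
  time = as.POSIXct("2010-01-01 00:00:00", tz = "UTC") + c(0, 15, 30) * 60,
  radiation = c(241, 438, 457)
)

# One-minute grid spanning the observed range
grid <- data.frame(time = seq(min(dat$time), max(dat$time), by = "min"))

# Keep every grid row; minutes without a measurement get NA
full <- merge(grid, dat, by = "time", all.x = TRUE)

# Interpolate against the actual time stamps rather than row positions
known <- !is.na(full$radiation)
full$radiation <- approx(x = as.numeric(full$time[known]),
                         y = full$radiation[known],
                         xout = as.numeric(full$time))$y

head(full, 3)
```

Because x and xout are real times here, the result should stay correct even if some 15-minute readings are missing from the input.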

Answer 1: (score: 0)

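Here is a solution that uses pad from the padr package to fill the gaps in the time column. na.approx from zoo is then used for the interpolation.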

library(padr)
library(zoo)

dat[1:2, ]
#                 time radiation
#1 2010-01-01 00:00:00       133
#2 2010-01-01 00:15:00       187

dat_padded <- pad(dat[1:2, ], interval = "min")
dat_padded$radiation <- zoo::na.approx(dat_padded$radiation)
dat_padded
#                  time radiation
#1  2010-01-01 00:00:00     133.0
#2  2010-01-01 00:01:00     136.6
#3  2010-01-01 00:02:00     140.2
#4  2010-01-01 00:03:00     143.8
#5  2010-01-01 00:04:00     147.4
#6  2010-01-01 00:05:00     151.0
#7  2010-01-01 00:06:00     154.6
#8  2010-01-01 00:07:00     158.2
#9  2010-01-01 00:08:00     161.8
#10 2010-01-01 00:09:00     165.4
#11 2010-01-01 00:10:00     169.0
#12 2010-01-01 00:11:00     172.6
#13 2010-01-01 00:12:00     176.2
#14 2010-01-01 00:13:00     179.8
#15 2010-01-01 00:14:00     183.4
#16 2010-01-01 00:15:00     187.0
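
Since the question mentions zoo, the same padding step can also be sketched with zoo alone, by merging the series with an empty zoo object that only carries the one-minute index. This is a hedged sketch rather than part of the answer above: the two readings are copied from the example, and the names obs, grid, per_min and res and the UTC time zone are assumptions made for illustration.

```r
library(zoo)

# Two 15-minute readings (same values as in the example above)
obs <- zoo(c(133, 187),
           order.by = as.POSIXct("2010-01-01 00:00:00", tz = "UTC") + c(0, 15) * 60)

# Empty zoo series that only carries the one-minute index
grid <- zoo(, seq(start(obs), end(obs), by = "min"))

# Merging inserts NA at the new time stamps; na.approx fills them linearly
per_min <- na.approx(merge(obs, grid))

# Convert back to a data frame if that is the format needed downstream
res <- data.frame(time = index(per_min), radiation = coredata(per_min))
```

For a zoo object, na.approx interpolates along the time index by default, so unevenly spaced observations are handled according to their actual time stamps.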

Data

set.seed(1)
dat <-
  data.frame(
    time = seq(
      as.POSIXct("2010-01-01"),
      as.POSIXct("2016-12-31") + 60 * 99,
      by = 60 * 15
    ),
    radiation = sample(1:500, 245383, replace = TRUE)
  )

Answer 2: (score: 0)

You can use the approx function like this:

dat <- data.frame(time = seq(as.POSIXct("2016-12-01"),
                             as.POSIXct("2016-12-31") + 60*99, 
                             by = 60*15),
                  radiation = sample(1:500, 2887, replace = TRUE))

mins <- seq(as.POSIXct("2016-12-01"),
            as.POSIXct("2016-12-31") + 60*99, 
            by = 60)

out <- approx(dat$time, dat$radiation, mins)
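
approx returns a list with components x and y, so one more step is needed to get back to a data frame. The sketch below shows one way to do that on a small made-up series; the four radiation values and the name res are hypothetical, and the time column is taken from mins directly.

```r
# Small made-up series: four 15-minute readings
dat <- data.frame(
  time = as.POSIXct("2016-12-01 00:00:00", tz = "UTC") + (0:3) * 15 * 60,
  radiation = c(100, 130, 70, 85)
)

# One-minute grid covering the same range as the observations
mins <- seq(min(dat$time), max(dat$time), by = "min")

# Linear interpolation of radiation at every minute
out <- approx(dat$time, dat$radiation, xout = mins)

# Assemble the result, reusing `mins` as the time column
res <- data.frame(time = mins, radiation = out$y)
head(res, 3)
```

With the default rule = 1, approx returns NA for any xout value outside the range of the observed times, so the grid should not extend beyond the first or last measurement unless that is intended.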