Extracting two points from a time series at a specified interval in R

Time: 2013-12-17 10:46:39

Tags: r timestamp time-series subset

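I need to extract two data points from a time series, separated by a specified interval, and store the result of a calculation on them for every point in the data set.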

For example, if I have the following data set:

datetime            O2av    Qav     A           Ka
11/07/2013 19:16    8.493   123.73  1276.270667 0.333133208
11/07/2013 19:17    8.496   123.73  1276.270667 0.331041617
11/07/2013 19:18    8.494   123.73  1276.270667 0.334246882
11/07/2013 19:19    8.4955  123.73  1276.270667 0.333804959
11/07/2013 19:20    8.493   123.73  1276.270667 0.338569186
11/07/2013 19:21    8.494   123.73  1276.270667 0.338476611
11/07/2013 19:22    8.4935  123.73  1276.270667 0.339429955
11/07/2013 19:23    8.492   123.73  1276.270667 0.342290738
11/07/2013 19:24    8.4895  123.73  1276.270667 0.345244346
11/07/2013 19:25    8.488   123.73  1276.270667 0.347501258
11/07/2013 19:26    8.489   123.73  1276.270667 0.349227795
11/07/2013 19:27    8.4855  123.73  1276.270667 0.352274231
11/07/2013 19:28    8.482   123.73  1276.270667 0.357140658
11/07/2013 19:29    8.4795  123.73  1276.270667 0.359490523
11/07/2013 19:30    8.48    123.73  1276.270667 0.360356046
11/07/2013 19:31    8.4765  123.73  1276.270667 0.365225985
11/07/2013 19:32    8.473   123.73  1276.270667 0.369489804
11/07/2013 19:33    8.469   123.73  1276.270667 0.375320489
11/07/2013 19:34    8.4655  123.73  1276.270667 0.379587326
11/07/2013 19:35    8.46    123.73  1276.270667 0.384640303
11/07/2013 19:36    8.461   123.73  1276.270667 0.385771643
11/07/2013 19:37    8.4525  123.73  1276.270667 0.394747899
11/07/2013 19:38    8.448   123.73  1276.270667 0.39849568
11/07/2013 19:39    8.4465  123.73  1276.270667 0.401373418
11/07/2013 19:40    8.4415  123.73  1276.270667 0.406692482

...then I want to perform this calculation:

met <- data.frame((O2avtime2 - O2avtime1 - Ka) * 1000 * Qav / A)

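where O2avtime2 is the O2av value at 11/07/2013 19:20 and O2avtime1 is the value 4 minutes earlier, i.e. at 11/07/2013 19:16. How would I then do this calculation for every point, so that for the next point O2avtime2 is the value at 11/07/2013 19:21 and O2avtime1 is the value at 11/07/2013 19:17? I would then like to store the results together with the corresponding timestamp (that of O2avtime2).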

So the output would look like this:

datetime                 met 
11/07/2013 19:20    -32.82310443
11/07/2013 19:21    -33.00802265
11/07/2013 19:22    -32.95502625
11/07/2013 19:23    -33.52320877
11/07/2013 19:24    -33.80955078
11/07/2013 19:25    -34.27071685
11/07/2013 19:26    -34.29267882
11/07/2013 19:27    -34.78191323
11/07/2013 19:28    -35.35064292
11/07/2013 19:29    -35.67540067
11/07/2013 19:30    -35.80778337
11/07/2013 19:31    -36.27990701
11/07/2013 19:32    -36.69326943
11/07/2013 19:33    -37.40395383
11/07/2013 19:34    -38.20539491
11/07/2013 19:35    -38.88915649
11/07/2013 19:36    -38.56257662
11/07/2013 19:37    -39.86905275
11/07/2013 19:38    -40.32933359
11/07/2013 19:39    -40.2205342
11/07/2013 19:40    -41.31787807

Basically, every value in the calculation comes from the same row except O2avtime1 (from the O2av column), which always lags 4 minutes behind.
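
As a sanity check on the formula, the first expected output row (19:20) can be reproduced by hand from the table above, taking O2avtime1 from the 19:16 row and everything else from the 19:20 row:

```latex
\text{met}_{19:20}
  = (8.493 - 8.493 - 0.338569186) \times 1000 \times \frac{123.73}{1276.270667}
  \approx -32.823
```

This agrees with the first value in the expected output.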

2 Answers:

Answer 0: (score: 1)

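If you have exactly one row per minute, you can use diff with lag=4. It returns four fewer values than the input, so pad the front with NA to keep the rows aligned. If your data.frame is called df3, you could use: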

(c(rep(NA,4),diff(df3$O2av,lag=4))-df3$Ka)*1000*df3$Qav/df3$A
# [1]        NA        NA        NA        NA -32.82310 -33.00802 -32.95503 -33.52321
# [9]  -33.80955 -34.27072 -34.29268 -34.78191 -35.35064 -35.67540 -35.80778 -36.27991
# [17] -36.69327 -37.40395 -38.20539 -38.88916 -38.56258 -39.86905 -40.32933 -40.22053
# [25] -41.31788
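
To keep the timestamp next to each result, as the question asks, the same idea can be wrapped in a data frame. This is only a minimal sketch and assumes one row per minute with no gaps. Here `df3` is rebuilt from the first six rows of the question's data, and `lag_n`, `d_o2` and `res` are names introduced for illustration.

```r
# First six rows of the question's data (one row per minute, no gaps)
df3 <- data.frame(
  datetime = c("11/07/2013 19:16", "11/07/2013 19:17", "11/07/2013 19:18",
               "11/07/2013 19:19", "11/07/2013 19:20", "11/07/2013 19:21"),
  O2av = c(8.493, 8.496, 8.494, 8.4955, 8.493, 8.494),
  Qav  = 123.73,
  A    = 1276.270667,
  Ka   = c(0.333133208, 0.331041617, 0.334246882,
           0.333804959, 0.338569186, 0.338476611),
  stringsAsFactors = FALSE
)

lag_n <- 4  # rows to look back (4 minutes at one row per minute)

# diff() returns lag_n fewer values than its input, so pad the front with NA
d_o2 <- c(rep(NA, lag_n), diff(df3$O2av, lag = lag_n))

res <- data.frame(
  datetime = df3$datetime,
  met      = (d_o2 - df3$Ka) * 1000 * df3$Qav / df3$A
)

# Drop the leading rows that have no observation 4 minutes earlier
res <- res[!is.na(res$met), ]
res
```

If the series can have missing minutes, the row-based lag no longer corresponds to 4 minutes, and a timestamp-based lookup is the safer choice.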

Answer 1: (score: 1)

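Something like this should work for you, and it does not assume that a reading exists four minutes before every point. It creates a new column in the data frame holding the difference from the value 4 minutes earlier; where no such row exists, the difference is NA. You can then do the calculation based on that difference.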

# Create fake data
datetime <- paste('11/07/2013 19:', seq(16,22), sep='')
O2av <- 1:7

df <- data.frame(datetime=datetime, O2av=O2av)
df$Qav <- 123.73
df$A <- 1276.270667
df$Ka <- .33

library(timeDate)
df$td <- as.POSIXlt(timeDate(datetime, '%d/%m/%Y %H:%M'))

# Set the row.names to the time stamp
row.names(df) <- as.character(df$td)

# Create a new column with a time stamp that is 4 minutes ago
df$minus4 <- df$td
df$minus4$min <- df$minus4$min-4 # This is safe on a POSIX object.

# Create the difference as a new column
df$diff <- df$O2av - df[as.character(df$minus4), 'O2av']

df$met <- (df$diff-df$Ka)*1000*(df$Qav)/df$A
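
The same lookup can also be sketched in base R without the timeDate package, by matching timestamps directly. This is an alternative to the row-name approach above rather than part of the original answer. The data below is a hypothetical subset of the question's rows with the 19:18 reading deliberately removed, and `idx` is a name introduced for illustration.

```r
# Hypothetical data with a gap: the 19:18 row is missing
df <- data.frame(
  datetime = c("11/07/2013 19:16", "11/07/2013 19:17", "11/07/2013 19:19",
               "11/07/2013 19:20", "11/07/2013 19:21", "11/07/2013 19:22"),
  O2av = c(8.493, 8.496, 8.4955, 8.493, 8.494, 8.4935),
  Qav  = 123.73,
  A    = 1276.270667,
  Ka   = c(0.333133208, 0.331041617, 0.333804959,
           0.338569186, 0.338476611, 0.339429955),
  stringsAsFactors = FALSE
)

# Parse to POSIXct; a fixed time zone avoids daylight-saving surprises
df$td <- as.POSIXct(df$datetime, format = "%d/%m/%Y %H:%M", tz = "UTC")

# For each row, find the index of the row stamped exactly 4 minutes earlier
# (NA when no such row exists)
idx <- match(as.numeric(df$td) - 4 * 60, as.numeric(df$td))

df$diff <- df$O2av - df$O2av[idx]
df$met  <- (df$diff - df$Ka) * 1000 * df$Qav / df$A
df[, c("datetime", "met")]
```

Only the 19:20 and 19:21 rows get a value here. The first three rows have no reading 4 minutes earlier, and 19:22 would need the missing 19:18 row, so those entries of `met` are NA.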