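I need to pull two data points out of a time series, a fixed interval apart, and store the result for every point in the data set.
For example, given the following data set: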
datetime O2av Qav A Ka
11/07/2013 19:16 8.493 123.73 1276.270667 0.333133208
11/07/2013 19:17 8.496 123.73 1276.270667 0.331041617
11/07/2013 19:18 8.494 123.73 1276.270667 0.334246882
11/07/2013 19:19 8.4955 123.73 1276.270667 0.333804959
11/07/2013 19:20 8.493 123.73 1276.270667 0.338569186
11/07/2013 19:21 8.494 123.73 1276.270667 0.338476611
11/07/2013 19:22 8.4935 123.73 1276.270667 0.339429955
11/07/2013 19:23 8.492 123.73 1276.270667 0.342290738
11/07/2013 19:24 8.4895 123.73 1276.270667 0.345244346
11/07/2013 19:25 8.488 123.73 1276.270667 0.347501258
11/07/2013 19:26 8.489 123.73 1276.270667 0.349227795
11/07/2013 19:27 8.4855 123.73 1276.270667 0.352274231
11/07/2013 19:28 8.482 123.73 1276.270667 0.357140658
11/07/2013 19:29 8.4795 123.73 1276.270667 0.359490523
11/07/2013 19:30 8.48 123.73 1276.270667 0.360356046
11/07/2013 19:31 8.4765 123.73 1276.270667 0.365225985
11/07/2013 19:32 8.473 123.73 1276.270667 0.369489804
11/07/2013 19:33 8.469 123.73 1276.270667 0.375320489
11/07/2013 19:34 8.4655 123.73 1276.270667 0.379587326
11/07/2013 19:35 8.46 123.73 1276.270667 0.384640303
11/07/2013 19:36 8.461 123.73 1276.270667 0.385771643
11/07/2013 19:37 8.4525 123.73 1276.270667 0.394747899
11/07/2013 19:38 8.448 123.73 1276.270667 0.39849568
11/07/2013 19:39 8.4465 123.73 1276.270667 0.401373418
11/07/2013 19:40 8.4415 123.73 1276.270667 0.406692482
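...I then want to perform this calculation:
met <- data.frame((O2avtime2 - O2avtime1 - Ka) * 1000 * Qav / A)
where O2avtime2 is the O2av value at 11/07/2013 19:20 and O2avtime1 is the O2av value 4 minutes earlier, i.e. at 11/07/2013 19:16. How would I do this for every point, so that for the next point O2avtime2 is at 11/07/2013 19:21 and O2avtime1 is at 11/07/2013 19:17? The result should then be stored together with the corresponding timestamp (that of O2avtime2).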
So the output would look like this:
datetime met
11/07/2013 19:20 -32.82310443
11/07/2013 19:21 -33.00802265
11/07/2013 19:22 -32.95502625
11/07/2013 19:23 -33.52320877
11/07/2013 19:24 -33.80955078
11/07/2013 19:25 -34.27071685
11/07/2013 19:26 -34.29267882
11/07/2013 19:27 -34.78191323
11/07/2013 19:28 -35.35064292
11/07/2013 19:29 -35.67540067
11/07/2013 19:30 -35.80778337
11/07/2013 19:31 -36.27990701
11/07/2013 19:32 -36.69326943
11/07/2013 19:33 -37.40395383
11/07/2013 19:34 -38.20539491
11/07/2013 19:35 -38.88915649
11/07/2013 19:36 -38.56257662
11/07/2013 19:37 -39.86905275
11/07/2013 19:38 -40.32933359
11/07/2013 19:39 -40.2205342
11/07/2013 19:40 -41.31787807
Basically, every value in the calculation comes from the same row, except O2avtime1 (taken from the O2av column), which always lags 4 minutes behind.
Answer 0 (score: 1)
If you have exactly one row per minute, you can use diff. If your data.frame is called df3, you can use:
(c(rep(NA,4),diff(df3$O2av,lag=4))-df3$Ka)*1000*df3$Qav/df3$A
# [1] NA NA NA NA -32.82310 -33.00802 -32.95503 -33.52321
# [9] -33.80955 -34.27072 -34.29268 -34.78191 -35.35064 -35.67540 -35.80778 -36.27991
# [17] -36.69327 -37.40395 -38.20539 -38.88916 -38.56258 -39.86905 -40.32933 -40.22053
# [25] -41.31788
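To get the two-column output the question asks for, the vector above could be attached to the timestamps and the leading NA rows dropped. This is only a minimal sketch: it assumes df3 holds the question's columns, reproduces just the first six rows, and the names lag_rows, d_o2 and out are illustrative, not part of the original answer.

```r
# First six rows of the question's data (assumed to live in df3)
df3 <- data.frame(
  datetime = c("11/07/2013 19:16", "11/07/2013 19:17", "11/07/2013 19:18",
               "11/07/2013 19:19", "11/07/2013 19:20", "11/07/2013 19:21"),
  O2av = c(8.493, 8.496, 8.494, 8.4955, 8.493, 8.494),
  Qav  = 123.73,
  A    = 1276.270667,
  Ka   = c(0.333133208, 0.331041617, 0.334246882,
           0.333804959, 0.338569186, 0.338476611),
  stringsAsFactors = FALSE
)

# 4 rows equal 4 minutes only when there is exactly one row per minute
lag_rows <- 4

# Lagged difference, padded with NA so it lines up with the rows of df3
d_o2 <- c(rep(NA, lag_rows), diff(df3$O2av, lag = lag_rows))
met  <- (d_o2 - df3$Ka) * 1000 * df3$Qav / df3$A

# Keep only the rows that have a value 4 rows (minutes) earlier
out <- data.frame(datetime = df3$datetime, met = met)[!is.na(met), ]
print(out)
```

With the full data set in df3, out should start at 11/07/2013 19:20, as in the desired output.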
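Answer 1 (score: 1)
Something like this should work for you, and it does not assume that every row has a reading exactly 4 minutes before it. It creates a new column in the data frame holding the difference from the value 4 minutes earlier; if no such row exists, the difference is NA. You can then compute met from that difference.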
# Create fake data
datetime <- paste0('11/07/2013 19:', 16:22)
O2av <- 1:7
df <- data.frame(datetime=datetime, O2av=O2av)
df$Qav <- 123.73
df$A <- 1276.270667
df$Ka <- .33
library(timeDate)
df$td <- as.POSIXlt(timeDate(datetime, '%d/%m/%Y %H:%M'))
# Set the row.names to the time stamp
row.names(df) <- as.character(df$td)
# Create a new column with a time stamp that is 4 minutes ago
df$minus4 <- df$td
df$minus4$min <- df$minus4$min-4 # This is safe on a POSIX object.
# Create the difference as a new column
df$diff <- df$O2av - df[as.character(df$minus4), 'O2av']
df$met <- (df$diff-df$Ka)*1000*(df$Qav)/df$A
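The same lookup can also be sketched in base R alone, matching timestamps numerically instead of through row names. This is an alternative sketch, not the code above: it swaps timeDate and POSIXlt for as.POSIXct and match, the fake data deliberately omits 19:18 to show the NA behaviour, and the UTC time zone and the names secs and idx are assumptions made for illustration.

```r
# Fake data with a deliberate gap: the 19:18 reading is missing
datetime <- c("11/07/2013 19:16", "11/07/2013 19:17", "11/07/2013 19:19",
              "11/07/2013 19:20", "11/07/2013 19:21", "11/07/2013 19:22")
df <- data.frame(datetime = datetime,
                 O2av = c(1, 2, 4, 5, 6, 7),
                 Qav = 123.73, A = 1276.270667, Ka = 0.33,
                 stringsAsFactors = FALSE)

# Parse day/month/year timestamps; tz = "UTC" is an assumption to avoid DST effects
df$td <- as.POSIXct(df$datetime, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Seconds since the epoch, so the comparison is plain numeric matching
secs <- as.numeric(df$td)

# For each row, the index of the row exactly 240 seconds earlier (NA if none)
idx <- match(secs - 4 * 60, secs)

# Difference from the value 4 minutes earlier, then the met calculation
df$diff <- df$O2av - df$O2av[idx]
df$met  <- (df$diff - df$Ka) * 1000 * df$Qav / df$A
print(df[, c("datetime", "diff", "met")])
```

Here only 19:20 and 19:21 get a value; 19:22 is NA because 19:18 is absent, whereas a row-based diff with lag 4 would silently pair it with the wrong reading.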