Splitting POSIXct time intervals

Asked: 2016-02-05 21:08:03

Tags: r time posix intervals

I have the following time intervals, and I would like to split each one into 10 equally spaced instants.

head(data)
             stoptime           starttime
1 2014-08-19 14:52:04 2014-08-19 15:22:04
2 2014-08-19 16:27:14 2014-08-19 17:17:33
3 2014-08-19 18:05:59 2014-08-19 18:09:12
4 2014-08-19 17:25:35 2014-08-19 17:29:06
5 2014-08-19 18:23:29 2014-08-19 18:57:34
6 2014-08-19 07:39:15 2014-08-19 07:48:49

I can get the midpoint with this code:
one_day$midtime = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) /2 , origin = '1970-01-01')

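However, when I try to extend this code to ten equally spaced instants, the results are completely wrong. Why does this happen, and how can I fix the code?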

one_day$first = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .1 , origin = '1970-01-01')
one_day$second = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .2, origin = '1970-01-01')
one_day$thrid = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .3, origin = '1970-01-01')
one_day$fourth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .4, origin = '1970-01-01')
one_day$fifth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .5, origin = '1970-01-01')
one_day$sixth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .6, origin = '1970-01-01')
one_day$seventh = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .7, origin = '1970-01-01')
one_day$eighth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .8, origin = '1970-01-01')
one_day$ninth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .9, origin = '1970-01-01')
head(one_day)
  diff.time            stoptime           starttime             midtime               first
1      1800 2014-08-19 14:52:04 2014-08-19 15:22:04 2014-08-19 15:07:04 1978-12-05 03:49:24
2      3019 2014-08-19 16:27:14 2014-08-19 17:17:33 2014-08-19 16:52:23 1978-12-05 04:10:28
3       193 2014-08-19 18:05:59 2014-08-19 18:09:12 2014-08-19 18:07:35 1978-12-05 04:25:31
4       211 2014-08-19 17:25:35 2014-08-19 17:29:06 2014-08-19 17:27:20 1978-12-05 04:17:28
5      2045 2014-08-19 18:23:29 2014-08-19 18:57:34 2014-08-19 18:40:31 1978-12-05 04:32:06
6       574 2014-08-19 07:39:15 2014-08-19 07:48:49 2014-08-19 07:44:02 1978-12-05 02:20:48
               second               thrid              fourth               fifth               sixth
1 1987-11-08 12:38:49 1996-10-11 21:28:14 2005-09-15 06:17:39 2014-08-19 15:07:04 2023-07-23 23:56:28
2 1987-11-08 13:20:57 1996-10-11 22:31:26 2005-09-15 07:41:54 2014-08-19 16:52:23 2023-07-24 02:02:52
3 1987-11-08 13:51:02 1996-10-11 23:16:33 2005-09-15 08:42:04 2014-08-19 18:07:35 2023-07-24 03:33:06
4 1987-11-08 13:34:56 1996-10-11 22:52:24 2005-09-15 08:09:52 2014-08-19 17:27:20 2023-07-24 02:44:48
5 1987-11-08 14:04:12 1996-10-11 23:36:18 2005-09-15 09:08:25 2014-08-19 18:40:31 2023-07-24 04:12:37
6 1987-11-08 09:41:36 1996-10-11 17:02:25 2005-09-15 00:23:13 2014-08-19 07:44:02 2023-07-23 15:04:50
              seventh              eighth               ninth
1 2032-06-26 08:45:53 2041-05-30 17:35:18 2050-05-04 02:24:43
2 2032-06-26 11:13:20 2041-05-30 20:23:49 2050-05-04 05:34:18
3 2032-06-26 12:58:37 2041-05-30 22:24:08 2050-05-04 07:49:39
4 2032-06-26 12:02:16 2041-05-30 21:19:44 2050-05-04 06:37:12
5 2032-06-26 13:44:44 2041-05-30 23:16:50 2050-05-04 08:48:56
6 2032-06-25 22:25:38 2041-05-30 05:46:27 2050-05-03 13:07:15
dput(data1)
structure(list(stoptime = structure(c(1408477924, 1408483634, 
1408489559, 1408487135, 1408490609, 1408451955, 1408452727, 1408498708, 
1408486644, 1408454996), class = c("POSIXct", "POSIXt"), tzone = "EST"), 
    starttime = structure(c(1408479724, 1408486653, 1408489752, 
    1408487346, 1408492654, 1408452529, 1408455826, 1408501153, 
    1408488389, 1408458514), class = c("POSIXct", "POSIXt"), tzone = "EST")), .Names = c("stoptime", 
"starttime"), row.names = c(NA, 10L), class = "data.frame")

2 answers:

Answer 0 (score: 4)

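1: seq

First, the columns of your data frame must be of class POSIXct or POSIXlt, because the base R function seq has a method for objects of these classes.

Consider this simplified code: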

library(lubridate)
a <- "2014-08-19 14:52:04"
b <- "2014-08-19 15:22:04"

a <- ymd_hms(a)
b <- ymd_hms(b)

a
[1] "2014-08-19 14:52:04 UTC"
b
[1] "2014-08-19 15:22:04 UTC"

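Then use the seq function and set the length.out argument to the number of values you want in the sequence. seq will generate equally spaced values from the start to the end, with both endpoints included.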

seq(a, b, length.out = 10)
[1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
[3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
[5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
[7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
[9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"

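2: Vectorizing step 1

Now that you know how to reach the goal for one interval, you only need a way to vectorize it over all the values.

I bet there are several ways; here is one. With the mapply function you can loop over the elements in parallel, pairing the first element of the first object with the first element of the second object, and so on. Remember to pass the arguments that stay fixed through the MoreArgs argument.

Here is the code: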

# Call seq() once per row; SIMPLIFY = FALSE keeps each result as a POSIXct vector
mapply(seq,
       to = data1$starttime,
       from = data1$stoptime,
       MoreArgs = list(length.out = 10),
       SIMPLIFY = FALSE)

This produces a list with the desired data, but not in the desired format:

[[1]]
 [1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
 [3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
 [5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
 [7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
 [9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"

[[2]]
 [1] "2014-08-19 16:27:14 UTC" "2014-08-19 16:32:49 UTC"
 [3] "2014-08-19 16:38:24 UTC" "2014-08-19 16:44:00 UTC"
 [5] "2014-08-19 16:49:35 UTC" "2014-08-19 16:55:11 UTC"
 [7] "2014-08-19 17:00:46 UTC" "2014-08-19 17:06:22 UTC"
 [9] "2014-08-19 17:11:57 UTC" "2014-08-19 17:17:33 UTC"

[[3]]
 [1] "2014-08-19 18:05:59 UTC" "2014-08-19 18:06:20 UTC"
 [3] "2014-08-19 18:06:41 UTC" "2014-08-19 18:07:03 UTC"
 [5] "2014-08-19 18:07:24 UTC" "2014-08-19 18:07:46 UTC"
 [7] "2014-08-19 18:08:07 UTC" "2014-08-19 18:08:29 UTC"
 [9] "2014-08-19 18:08:50 UTC" "2014-08-19 18:09:12 UTC"

[[4]]
 [1] "2014-08-19 17:25:35 UTC" "2014-08-19 17:25:58 UTC"
 [3] "2014-08-19 17:26:21 UTC" "2014-08-19 17:26:45 UTC"
 [5] "2014-08-19 17:27:08 UTC" "2014-08-19 17:27:32 UTC"
 [7] "2014-08-19 17:27:55 UTC" "2014-08-19 17:28:19 UTC"
 [9] "2014-08-19 17:28:42 UTC" "2014-08-19 17:29:06 UTC"

[[5]]
 [1] "2014-08-19 18:23:29 UTC" "2014-08-19 18:27:16 UTC"
 [3] "2014-08-19 18:31:03 UTC" "2014-08-19 18:34:50 UTC"
 [5] "2014-08-19 18:38:37 UTC" "2014-08-19 18:42:25 UTC"
 [7] "2014-08-19 18:46:12 UTC" "2014-08-19 18:49:59 UTC"
 [9] "2014-08-19 18:53:46 UTC" "2014-08-19 18:57:34 UTC"

[[6]]
 [1] "2014-08-19 07:39:15 UTC" "2014-08-19 07:40:18 UTC"
 [3] "2014-08-19 07:41:22 UTC" "2014-08-19 07:42:26 UTC"
 [5] "2014-08-19 07:43:30 UTC" "2014-08-19 07:44:33 UTC"
 [7] "2014-08-19 07:45:37 UTC" "2014-08-19 07:46:41 UTC"
 [9] "2014-08-19 07:47:45 UTC" "2014-08-19 07:48:49 UTC"

[[7]]
 [1] "2014-08-19 07:52:07 UTC" "2014-08-19 07:57:51 UTC"
 [3] "2014-08-19 08:03:35 UTC" "2014-08-19 08:09:20 UTC"
 [5] "2014-08-19 08:15:04 UTC" "2014-08-19 08:20:48 UTC"
 [7] "2014-08-19 08:26:33 UTC" "2014-08-19 08:32:17 UTC"
 [9] "2014-08-19 08:38:01 UTC" "2014-08-19 08:43:46 UTC"

[[8]]
 [1] "2014-08-19 20:38:28 UTC" "2014-08-19 20:42:59 UTC"
 [3] "2014-08-19 20:47:31 UTC" "2014-08-19 20:52:03 UTC"
 [5] "2014-08-19 20:56:34 UTC" "2014-08-19 21:01:06 UTC"
 [7] "2014-08-19 21:05:38 UTC" "2014-08-19 21:10:09 UTC"
 [9] "2014-08-19 21:14:41 UTC" "2014-08-19 21:19:13 UTC"

[[9]]
 [1] "2014-08-19 17:17:24 UTC" "2014-08-19 17:20:37 UTC"
 [3] "2014-08-19 17:23:51 UTC" "2014-08-19 17:27:05 UTC"
 [5] "2014-08-19 17:30:19 UTC" "2014-08-19 17:33:33 UTC"
 [7] "2014-08-19 17:36:47 UTC" "2014-08-19 17:40:01 UTC"
 [9] "2014-08-19 17:43:15 UTC" "2014-08-19 17:46:29 UTC"

[[10]]
 [1] "2014-08-19 08:29:56 UTC" "2014-08-19 08:36:26 UTC"
 [3] "2014-08-19 08:42:57 UTC" "2014-08-19 08:49:28 UTC"
 [5] "2014-08-19 08:55:59 UTC" "2014-08-19 09:02:30 UTC"
 [7] "2014-08-19 09:09:01 UTC" "2014-08-19 09:15:32 UTC"
 [9] "2014-08-19 09:22:03 UTC" "2014-08-19 09:28:34 UTC"

At this point I think it is just a matter of data manipulation, but I can't find a way to do it (for now).
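One possible way to finish that last reshaping step is sketched below. It is not part of the original answer: the column names t1 to t10, the two sample rows, and tz = "UTC" are assumptions made for illustration. The idea is to stack the per-row sequences as a numeric matrix of seconds since the epoch and then convert each column back to POSIXct.

```r
# Two sample rows copied from the question; tz = "UTC" is an assumption.
data1 <- data.frame(
  stoptime  = as.POSIXct(c("2014-08-19 14:52:04", "2014-08-19 16:27:14"), tz = "UTC"),
  starttime = as.POSIXct(c("2014-08-19 15:22:04", "2014-08-19 17:17:33"), tz = "UTC")
)

# One sequence of 10 instants per row, as in the mapply call above.
steps <- mapply(seq,
                from = data1$stoptime,
                to = data1$starttime,
                MoreArgs = list(length.out = 10),
                SIMPLIFY = FALSE)

# Stack the sequences as rows of a numeric matrix (seconds since 1970-01-01).
step_mat <- do.call(rbind, lapply(steps, as.numeric))

# Convert each matrix column back into a POSIXct vector.
step_cols <- lapply(seq_len(ncol(step_mat)), function(j) {
  as.POSIXct(step_mat[, j], origin = "1970-01-01", tz = "UTC")
})
names(step_cols) <- paste0("t", seq_along(step_cols))  # hypothetical column names

# Wide result: the original two columns plus t1 ... t10.
wide <- cbind(data1, as.data.frame(step_cols))
```

Going through a numeric matrix avoids losing the POSIXct class when the list elements are combined; the class and time zone are restored column by column at the end.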

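Answer 1 (score: 1)

You can't multiply the timestamps by 0.1. Your code multiplies the sum of two epoch timestamps, so 0.1 times that sum is roughly 0.2 times a 2014 timestamp, which lands in 1978; only the factor 0.5 happens to give a point inside the interval (the midpoint). Instead, you have to add 0.1 of the interval's length to the earlier time. For example: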

one_day$firstexample = one_day$stoptime + 0.1*difftime(one_day$starttime, one_day$stoptime, units = "mins")

As a side note, if you find yourself typing very similar things many times, that is usually a sign that you should turn it into a function.
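As a rough illustration of that advice, the nine near-identical lines from the question could be replaced by a small helper and a loop. This is only a sketch: the function name interval_point, the column names p10 to p90, the single sample row, and tz = "UTC" are assumptions, not something given in the answer.

```r
# Hypothetical helper: returns the instant lying `fraction` of the way
# from `start` to `end` (0 gives start, 1 gives end).
interval_point <- function(start, end, fraction) {
  start + fraction * as.numeric(difftime(end, start, units = "secs"))
}

# One sample row copied from the question; tz = "UTC" is an assumption.
one_day <- data.frame(
  stoptime  = as.POSIXct("2014-08-19 14:52:04", tz = "UTC"),
  starttime = as.POSIXct("2014-08-19 15:22:04", tz = "UTC")
)

# Add one column per fraction: p10, p20, ..., p90 (hypothetical names).
for (f in seq(0.1, 0.9, by = 0.1)) {
  col <- paste0("p", round(f * 100))
  one_day[[col]] <- interval_point(one_day$stoptime, one_day$starttime, f)
}
```

With this sample row the p50 column reproduces the midtime value 2014-08-19 15:07:04 from the question, which is a quick check that the helper agrees with the midpoint formula.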