I have the following time intervals, and I want to split each one into 10 equally spaced instants.
head(data)
             stoptime           starttime
1 2014-08-19 14:52:04 2014-08-19 15:22:04
2 2014-08-19 16:27:14 2014-08-19 17:17:33
3 2014-08-19 18:05:59 2014-08-19 18:09:12
4 2014-08-19 17:25:35 2014-08-19 17:29:06
5 2014-08-19 18:23:29 2014-08-19 18:57:34
6 2014-08-19 07:39:15 2014-08-19 07:48:49
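I can get the midpoint with this code:
one_day$midtime = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) / 2, origin = '1970-01-01')
However, when I try to extend this code to ten equally spaced instants, the results are completely wrong. Why does this happen, and how can I fix the code?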
one_day$first = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .1, origin = '1970-01-01')
one_day$second = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .2, origin = '1970-01-01')
one_day$thrid = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .3, origin = '1970-01-01')
one_day$fourth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .4, origin = '1970-01-01')
one_day$fifth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .5, origin = '1970-01-01')
one_day$sixth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .6, origin = '1970-01-01')
one_day$seventh = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .7, origin = '1970-01-01')
one_day$eighth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .8, origin = '1970-01-01')
one_day$ninth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .9, origin = '1970-01-01')
head(one_day)
  diff.time            stoptime           starttime             midtime               first
1      1800 2014-08-19 14:52:04 2014-08-19 15:22:04 2014-08-19 15:07:04 1978-12-05 03:49:24
2      3019 2014-08-19 16:27:14 2014-08-19 17:17:33 2014-08-19 16:52:23 1978-12-05 04:10:28
3       193 2014-08-19 18:05:59 2014-08-19 18:09:12 2014-08-19 18:07:35 1978-12-05 04:25:31
4       211 2014-08-19 17:25:35 2014-08-19 17:29:06 2014-08-19 17:27:20 1978-12-05 04:17:28
5      2045 2014-08-19 18:23:29 2014-08-19 18:57:34 2014-08-19 18:40:31 1978-12-05 04:32:06
6       574 2014-08-19 07:39:15 2014-08-19 07:48:49 2014-08-19 07:44:02 1978-12-05 02:20:48
               second               thrid              fourth               fifth               sixth
1 1987-11-08 12:38:49 1996-10-11 21:28:14 2005-09-15 06:17:39 2014-08-19 15:07:04 2023-07-23 23:56:28
2 1987-11-08 13:20:57 1996-10-11 22:31:26 2005-09-15 07:41:54 2014-08-19 16:52:23 2023-07-24 02:02:52
3 1987-11-08 13:51:02 1996-10-11 23:16:33 2005-09-15 08:42:04 2014-08-19 18:07:35 2023-07-24 03:33:06
4 1987-11-08 13:34:56 1996-10-11 22:52:24 2005-09-15 08:09:52 2014-08-19 17:27:20 2023-07-24 02:44:48
5 1987-11-08 14:04:12 1996-10-11 23:36:18 2005-09-15 09:08:25 2014-08-19 18:40:31 2023-07-24 04:12:37
6 1987-11-08 09:41:36 1996-10-11 17:02:25 2005-09-15 00:23:13 2014-08-19 07:44:02 2023-07-23 15:04:50
              seventh              eighth               ninth
1 2032-06-26 08:45:53 2041-05-30 17:35:18 2050-05-04 02:24:43
2 2032-06-26 11:13:20 2041-05-30 20:23:49 2050-05-04 05:34:18
3 2032-06-26 12:58:37 2041-05-30 22:24:08 2050-05-04 07:49:39
4 2032-06-26 12:02:16 2041-05-30 21:19:44 2050-05-04 06:37:12
5 2032-06-26 13:44:44 2041-05-30 23:16:50 2050-05-04 08:48:56
6 2032-06-25 22:25:38 2041-05-30 05:46:27 2050-05-03 13:07:15
dput(data1)
structure(list(
  stoptime = structure(c(1408477924, 1408483634, 1408489559, 1408487135, 1408490609,
                         1408451955, 1408452727, 1408498708, 1408486644, 1408454996),
                       class = c("POSIXct", "POSIXt"), tzone = "EST"),
  starttime = structure(c(1408479724, 1408486653, 1408489752, 1408487346, 1408492654,
                          1408452529, 1408455826, 1408501153, 1408488389, 1408458514),
                        class = c("POSIXct", "POSIXt"), tzone = "EST")),
  .Names = c("stoptime", "starttime"), row.names = c(NA, 10L), class = "data.frame")
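Answer 0 (score: 4)
First, you must convert the data frame's columns to class POSIXct or POSIXlt, because the base R function seq has methods for objects of those classes.
Consider this simplified code: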
library(lubridate)
a <- "2014-08-19 14:52:04"
b <- "2014-08-19 15:22:04"
a <- ymd_hms(a)
b <- ymd_hms(b)
a
[1] "2014-08-19 14:52:04 UTC"
b
[1] "2014-08-19 15:22:04 UTC"
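Then use the seq function and set its length.out argument to the number of values you want in the sequence. The call automatically creates a sequence of equally spaced values from the start to the end.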
seq(a, b, length.out = 10)
[1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
[3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
[5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
[7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
[9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"
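Note that length.out counts both endpoints, so length.out = 10 gives nine equal steps. If what you actually want are the 10%, 20%, ..., 90% instants from the question, one possible sketch (my reading of the question, not something it states explicitly) is to ask for 11 points and drop the two endpoints. The endpoints below are the first row of the question's data, built with base as.POSIXct instead of lubridate so the snippet has no dependencies:

```r
# Assumed example endpoints, taken from the first row of the question's data.
a <- as.POSIXct("2014-08-19 14:52:04", tz = "UTC")
b <- as.POSIXct("2014-08-19 15:22:04", tz = "UTC")

# 11 points split the interval into 10 equal steps.
pts <- seq(a, b, length.out = 11)

# Drop both endpoints to keep only the 9 interior instants (10% ... 90%).
interior <- pts[-c(1, length(pts))]
interior
```

For this 30-minute interval each step is 3 minutes, so the first interior instant is 14:55:04.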
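Now that you know how to get the result for one interval, you only need to work out how to vectorize it over all the values.
I bet there are several ways; here is one. With the mapply function you can loop over the elements, pairing the first element of the first object with the first element of the second object, and so on. Remember that you must use the MoreArgs argument to specify which arguments stay fixed.
Here is the code: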
mapply(seq,
to = data1$starttime,
from = data1$stoptime,
MoreArgs = list(length.out = 10),
SIMPLIFY = FALSE)
This produces a list containing the desired data, but not in the desired format:
[[1]]
[1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
[3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
[5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
[7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
[9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"
[[2]]
[1] "2014-08-19 16:27:14 UTC" "2014-08-19 16:32:49 UTC"
[3] "2014-08-19 16:38:24 UTC" "2014-08-19 16:44:00 UTC"
[5] "2014-08-19 16:49:35 UTC" "2014-08-19 16:55:11 UTC"
[7] "2014-08-19 17:00:46 UTC" "2014-08-19 17:06:22 UTC"
[9] "2014-08-19 17:11:57 UTC" "2014-08-19 17:17:33 UTC"
[[3]]
[1] "2014-08-19 18:05:59 UTC" "2014-08-19 18:06:20 UTC"
[3] "2014-08-19 18:06:41 UTC" "2014-08-19 18:07:03 UTC"
[5] "2014-08-19 18:07:24 UTC" "2014-08-19 18:07:46 UTC"
[7] "2014-08-19 18:08:07 UTC" "2014-08-19 18:08:29 UTC"
[9] "2014-08-19 18:08:50 UTC" "2014-08-19 18:09:12 UTC"
[[4]]
[1] "2014-08-19 17:25:35 UTC" "2014-08-19 17:25:58 UTC"
[3] "2014-08-19 17:26:21 UTC" "2014-08-19 17:26:45 UTC"
[5] "2014-08-19 17:27:08 UTC" "2014-08-19 17:27:32 UTC"
[7] "2014-08-19 17:27:55 UTC" "2014-08-19 17:28:19 UTC"
[9] "2014-08-19 17:28:42 UTC" "2014-08-19 17:29:06 UTC"
[[5]]
[1] "2014-08-19 18:23:29 UTC" "2014-08-19 18:27:16 UTC"
[3] "2014-08-19 18:31:03 UTC" "2014-08-19 18:34:50 UTC"
[5] "2014-08-19 18:38:37 UTC" "2014-08-19 18:42:25 UTC"
[7] "2014-08-19 18:46:12 UTC" "2014-08-19 18:49:59 UTC"
[9] "2014-08-19 18:53:46 UTC" "2014-08-19 18:57:34 UTC"
[[6]]
[1] "2014-08-19 07:39:15 UTC" "2014-08-19 07:40:18 UTC"
[3] "2014-08-19 07:41:22 UTC" "2014-08-19 07:42:26 UTC"
[5] "2014-08-19 07:43:30 UTC" "2014-08-19 07:44:33 UTC"
[7] "2014-08-19 07:45:37 UTC" "2014-08-19 07:46:41 UTC"
[9] "2014-08-19 07:47:45 UTC" "2014-08-19 07:48:49 UTC"
[[7]]
[1] "2014-08-19 07:52:07 UTC" "2014-08-19 07:57:51 UTC"
[3] "2014-08-19 08:03:35 UTC" "2014-08-19 08:09:20 UTC"
[5] "2014-08-19 08:15:04 UTC" "2014-08-19 08:20:48 UTC"
[7] "2014-08-19 08:26:33 UTC" "2014-08-19 08:32:17 UTC"
[9] "2014-08-19 08:38:01 UTC" "2014-08-19 08:43:46 UTC"
[[8]]
[1] "2014-08-19 20:38:28 UTC" "2014-08-19 20:42:59 UTC"
[3] "2014-08-19 20:47:31 UTC" "2014-08-19 20:52:03 UTC"
[5] "2014-08-19 20:56:34 UTC" "2014-08-19 21:01:06 UTC"
[7] "2014-08-19 21:05:38 UTC" "2014-08-19 21:10:09 UTC"
[9] "2014-08-19 21:14:41 UTC" "2014-08-19 21:19:13 UTC"
[[9]]
[1] "2014-08-19 17:17:24 UTC" "2014-08-19 17:20:37 UTC"
[3] "2014-08-19 17:23:51 UTC" "2014-08-19 17:27:05 UTC"
[5] "2014-08-19 17:30:19 UTC" "2014-08-19 17:33:33 UTC"
[7] "2014-08-19 17:36:47 UTC" "2014-08-19 17:40:01 UTC"
[9] "2014-08-19 17:43:15 UTC" "2014-08-19 17:46:29 UTC"
[[10]]
[1] "2014-08-19 08:29:56 UTC" "2014-08-19 08:36:26 UTC"
[3] "2014-08-19 08:42:57 UTC" "2014-08-19 08:49:28 UTC"
[5] "2014-08-19 08:55:59 UTC" "2014-08-19 09:02:30 UTC"
[7] "2014-08-19 09:09:01 UTC" "2014-08-19 09:15:32 UTC"
[9] "2014-08-19 09:22:03 UTC" "2014-08-19 09:28:34 UTC"
At this point I think it is just a matter of data manipulation, but I can't find a way to do it (for now).
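For what it's worth, one possible way to turn such a list into data frame columns is sketched below. It goes through plain numbers, because binding the date-times into a matrix drops their class, and then converts each column back to POSIXct. The two-row data1, the column names t1 ... t10, and the UTC time zone are assumptions made for illustration only:

```r
# Small stand-in for data1 (first two rows of the question's data; UTC assumed).
data1 <- data.frame(
  stoptime  = as.POSIXct(c("2014-08-19 14:52:04", "2014-08-19 16:27:14"), tz = "UTC"),
  starttime = as.POSIXct(c("2014-08-19 15:22:04", "2014-08-19 17:17:33"), tz = "UTC")
)

# One sequence of 10 instants per row, as in the mapply call above.
seqs <- mapply(seq,
               from = data1$stoptime,
               to = data1$starttime,
               MoreArgs = list(length.out = 10),
               SIMPLIFY = FALSE)

# Stack the sequences as rows of a numeric matrix (seconds since 1970-01-01).
mat <- do.call(rbind, lapply(seqs, as.numeric))

# Convert each matrix column back to POSIXct and attach it as a new column.
# The names t1 ... t10 are hypothetical.
for (j in seq_len(ncol(mat))) {
  data1[[paste0("t", j)]] <- as.POSIXct(mat[, j], origin = "1970-01-01", tz = "UTC")
}

str(data1)
```

Here t1 repeats stoptime and t10 repeats starttime, since the sequences include both endpoints.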
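Answer 1 (score: 1)
You can't multiply the times by 0.1. as.numeric() gives seconds since 1970-01-01, so scaling the sum of two such values lands on an unrelated date; the 0.5 case only works because it happens to be the average of the two times. Instead, you have to add 0.1 of the interval's length to the earlier time. For example: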
one_day$firstexample = one_day$stoptime + 0.1*difftime(one_day$starttime, one_day$stoptime, units = "mins")
As a side note, if you find yourself typing very similar things many times, that is usually a sign that you should turn it into a function.
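A minimal sketch of what such a function could look like follows. The helper name point_in_interval, the column names p10 ... p90, and the two-row sample data are all made up for illustration; only the stoptime + fraction * difftime(...) idea comes from the example above:

```r
# Hypothetical helper: the instant lying `frac` of the way from `start` to `end`.
point_in_interval <- function(start, end, frac) {
  start + frac * difftime(end, start, units = "secs")
}

# Small stand-in for one_day (first two rows of the question's data; UTC assumed).
one_day <- data.frame(
  stoptime  = as.POSIXct(c("2014-08-19 14:52:04", "2014-08-19 16:27:14"), tz = "UTC"),
  starttime = as.POSIXct(c("2014-08-19 15:22:04", "2014-08-19 17:17:33"), tz = "UTC")
)

# Add one column per fraction 0.1, 0.2, ..., 0.9 (columns p10, p20, ..., p90).
for (f in seq(0.1, 0.9, by = 0.1)) {
  col_name <- sprintf("p%02d", as.integer(round(f * 100)))
  one_day[[col_name]] <- point_in_interval(one_day$stoptime, one_day$starttime, f)
}

one_day$p50  # should match the midtime column computed in the question
```

Keeping the fraction as an argument means the nine near-identical lines in the question collapse into one loop, and a typo in one of them can no longer go unnoticed.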