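I want to calculate how long a traffic light stays green, amber and red (column sg. 0 in my sample data) within each signal cycle, for example the total time from the first green state to the last green state of each cycle. How can I do this?
The data.frame looks like this: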
time sg. 0
1 2014-09-01 00:00:12.0 green
2 2014-09-01 00:00:13.5 green
3 2014-09-01 00:00:30.0 amber
4 2014-09-01 00:00:30.0 amber
5 2014-09-01 00:00:31.5 amber
6 2014-09-01 00:00:32.0 amber
7 2014-09-01 00:00:32.2 amber
8 2014-09-01 00:00:33.5 amber
9 2014-09-01 00:00:33.0 red
10 2014-09-01 00:00:35.0 red
11 2014-09-01 00:00:35.2 red
12 2014-09-01 00:00:37.0 red
13 2014-09-01 00:00:41.0 red
14 2014-09-01 00:00:42.0 red
15 2014-09-01 00:00:42.2 red
16 2014-09-01 00:00:43.0 red
17 2014-09-01 00:00:44.7 red
18 2014-09-01 00:00:44.2 red
19 2014-09-01 00:00:45.5 red
20 2014-09-01 00:00:47.0 red
21 2014-09-01 00:00:48.7 red
22 2014-09-01 00:00:49.7 red
23 2014-09-01 00:00:49.7 red
24 2014-09-01 00:00:49.9 red
25 2014-09-01 00:00:50.9 green
26 2014-09-01 00:00:50.0 green
27 2014-09-01 00:00:52.0 green
28 2014-09-01 00:00:53.0 green
29 2014-09-01 00:00:54.0 green
30 2014-09-01 00:00:55.0 green
31 2014-09-01 00:00:55.0 green
32 2014-09-01 00:01:02.0 green
33 2014-09-01 00:01:03.7 green
34 2014-09-01 00:01:05.7 green
35 2014-09-01 00:01:07.0 green
Raw data:
structure(list(time = structure(c(1409518812, 1409518813.6, 1409518830,
1409518830.1, 1409518831.6, 1409518832, 1409518832.2, 1409518833.6,
1409518833, 1409518835, 1409518835.3, 1409518837, 1409518841,
1409518842, 1409518842.3, 1409518843, 1409518844.8, 1409518844.2,
1409518845.6, 1409518847, 1409518848.7, 1409518849.7, 1409518849.8,
1409518849.9, 1409518850.9, 1409518850, 1409518852, 1409518853,
1409518854, 1409518855, 1409518855.1, 1409518862, 1409518863.8,
1409518865.8, 1409518867, 1409518868, 1409518870.7, 1409518870.3,
1409518884, 1409518884.2, 1409518884.3, 1409518884.5, 1409518890,
1409518942, 1409518942.1, 1409518943.7, 1409518943.3, 1409518944.9,
1409518944, 1409518945, 1409518947, 1409518949.5, 1409518949.6,
1409518953, 1409518954, 1409518957.8, 1409518957.2, 1409518961,
1409518961.1, 1409518961.2, 1409518962.2, 1409518962.3, 1409518964,
1409518965, 1409518966, 1409518967, 1409518967.1, 1409518974,
1409518975.8, 1409518977.8, 1409518979, 1409518980, 1409519068,
1409519068.1, 1409519068.7, 1409519070, 1409519071, 1409519073,
1409519073.8, 1409519081, 1409519082, 1409519083.3, 1409519083.8,
1409519084.7, 1409519086, 1409519087.6, 1409519089.2, 1409519089.3,
1409519091, 1409519091.1, 1409519091.6, 1409519092, 1409519092.1,
1409519093, 1409519094, 1409519094.5, 1409519095, 1409519095.1,
1409519103, 1409519104), class = c("POSIXct", "POSIXt")), `sg. 0` = structure(c(2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L,
2L, 2L, 2L), .Label = c("amber", "green", "red"), class = "factor")), .Names = c("time",
"sg. 0"), row.names = c(NA, 100L), class = "data.frame")
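Answer 0 (score: 2)
You probably want to first uniquely identify each run of a single colour; then you can collect statistics for each run. You can find the runs with
cycle<-cumsum(c(FALSE, dd[-1,2] != dd[-nrow(dd),2]))
(assuming your data.frame is named dd). Then you can find the duration from the start to the end of each run with
tapply(dd[,1], interaction(dd[,2], cycle, drop=T), function(x) diff(range(x)))
which gives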
green.0 amber.1 red.2 green.3 amber.4 red.5 green.6 amber.7 red.8 green.9
1.6 3.6 16.9 40.0 2.9 16.2 17.8 2.0 23.5 9.0
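To see why the cumsum expression labels runs, here is a minimal sketch on a made-up vector (the `state` vector is hypothetical, not the question's data). Every position whose value differs from its predecessor adds 1 to the running sum, so each run of identical values ends up with its own integer id.

```r
# Hypothetical toy input, not the question's data
state <- c("green", "green", "amber", "red", "red", "green")

# TRUE wherever the state differs from the previous one;
# the first element has no predecessor, so it is never a change
changed <- c(FALSE, state[-1] != state[-length(state)])

# Running count of changes = run id, starting at 0
run_id <- cumsum(changed)
run_id
# 0 0 1 2 2 3
```

The second green run gets id 3 rather than 0, which is what lets tapply treat it separately from the first green run.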
Or, if by "cycle" you mean a complete green/amber/red cycle, you can do
cycle<-cumsum(c(dd[1,2]!="green", dd[-1,2] == "green" & dd[-nrow(dd),2] !="green"))
tapply(dd[,1], cycle, function(x) as.double(diff(range(x)), units="mins"))
which gives (in minutes)
0 1 2 3
0.6316667 1.8533333 2.2050000 0.1500000
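The cycle counter above only increments at a transition into green. A small sketch on a made-up vector (the `state` vector is hypothetical) shows the effect:

```r
# Hypothetical toy input, not the question's data
state <- c("green", "amber", "red", "red",
           "green", "green", "amber", "red",
           "green")
n <- length(state)

# A cycle starts where the current state is green and the previous one is not.
# The first element is flagged only if the series does not begin on green.
starts_cycle <- c(state[1] != "green",
                  state[-1] == "green" & state[-n] != "green")

cycle <- cumsum(starts_cycle)
cycle
# 0 0 0 0 1 1 1 1 2
```

Consecutive green observations do not start a new cycle, because the previous state is already green.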
Answer 1 (score: 1)
Similar to MrFlick's approach, you can use rle to first generate an indicator for each colour run, and then use it to calculate the durations.
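# The dput above names the column "sg. 0"; this code assumes it has been
# renamed to the syntactic name sg.0, e.g. names(dat)[2] <- "sg.0"
# If you want to calculate the time within each colour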
r <- rle(as.numeric(dat$sg.0))
r$values <- seq_along(r$values)
dat$id <- inverse.rle(r)
(a <- aggregate(time ~ sg.0 + id, dat, function(i) diff(as.numeric(range(i)))))
# sg.0 id time
#1 green 1 1.6
#2 amber 2 3.6
#3 red 3 16.9
# ...
# Use a similar approach, if the cycle is for each green/amber/red
r <- rle(as.numeric(dat$sg.0))
# Assumes every cycle consists of exactly three runs (green, amber, red)
r$values <- rep(seq_along(r$values), each=3, length.out=length(r$values))
dat$cycle <- inverse.rle(r)
(b <- aggregate(time ~ cycle, dat, function(i) diff(as.numeric(range(i)))))
# cycle time
#1 1 37.9
#2 2 111.2
#3 3 132.3
#4 4 9.0
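The two ideas can be combined to get one duration per colour per cycle, which seems to be what the question asks for. The following is only a sketch on made-up data: the data frame, the column name `state` and the rule "a cycle starts at each green run" are assumptions for illustration.

```r
# Hypothetical toy data (seconds since an arbitrary origin), not the question's data
dat <- data.frame(
  time  = as.POSIXct(c(0, 2, 5, 6, 9, 12, 15, 16, 20),
                     origin = "1970-01-01", tz = "UTC"),
  state = factor(c("green", "green", "amber", "red", "red",
                   "green", "amber", "red", "red"))
)

# Number the runs of identical states with rle()/inverse.rle()
r <- rle(as.character(dat$state))
r$values <- seq_along(r$values)
dat$run <- inverse.rle(r)

# Assumption: a new cycle starts at the first observation of every green run
dat$cycle <- cumsum(!duplicated(dat$run) & dat$state == "green")

# Seconds from the first to the last observation of each colour in each cycle
res <- aggregate(time ~ state + cycle, dat,
                 function(x) diff(as.numeric(range(x))))
res
```

Because the duration is measured from the first to the last observation, a colour that was recorded only once in a cycle gets a duration of 0.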
Edit: added as.numeric to the aggregate function calls so that durations are consistently reported in seconds.
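The reason the conversion matters is that subtracting POSIXct values yields a difftime whose units are chosen automatically from the size of the difference. A short sketch with made-up timestamps (the values are hypothetical):

```r
# Hypothetical timestamps: one span of 30 seconds, one of 120 seconds
t_short <- as.POSIXct(c(0, 30),  origin = "1970-01-01", tz = "UTC")
t_long  <- as.POSIXct(c(0, 120), origin = "1970-01-01", tz = "UTC")

d_short <- diff(range(t_short))  # difftime, units picked automatically
d_long  <- diff(range(t_long))

units(d_short)  # "secs"
units(d_long)   # "mins"

# Two ways to force seconds regardless of the span
as.numeric(d_long, units = "secs")  # 120
diff(as.numeric(range(t_long)))     # 120
```

Without the conversion, a 30-second run would be reported as 30 and a 120-second run as 2, and the mismatch is easy to miss once the units attribute is dropped.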