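R: find the maximum value within runs of consecutive timestamps (here, seconds)

Time: 2019-11-20 08:20:51

Tags: r

I have the following data. I want to find runs of consecutive seconds and keep only the row with the largest Value2 from each run. For example, in the data below there are events in three consecutive seconds on April 13 at 04:25 (04:25:18 to 04:25:20). From those I want to keep only the row with the largest Value2, which is 04:25:18.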

time                    Value1    Value2
2018-04-13 01:19:04 0.09760860  68.41634
2018-04-13 01:20:10 0.24585245  32.94790
2018-04-13 01:21:16 0.24487727  28.99412
2018-04-13 01:54:06 0.22994130  37.63333
2018-04-13 03:27:17 0.11139787  83.40588
2018-04-13 03:36:20 0.04642794 102.15588
2018-04-13 03:37:39 0.04144001 109.93137
2018-04-13 03:38:17 0.03933649 106.77124
2018-04-13 04:04:15 0.27627418  42.60554
2018-04-13 04:13:24 0.11536228  65.87941
2018-04-13 04:13:25 0.14011963  66.10706
2018-04-13 04:13:46 0.09159499  70.96471
2018-04-13 04:24:27 0.03760945 120.97294
2018-04-13 04:24:39 0.02905284 116.59853
2018-04-13 04:24:41 0.02751022 116.32059
2018-04-13 04:24:42 0.03271061 116.60840
2018-04-13 04:24:43 0.02836884 116.32471
2018-04-13 04:25:09 0.02983106 117.32745
2018-04-13 04:25:18 0.03332321 118.45747
2018-04-13 04:25:19 0.03218042 117.61882
2018-04-13 04:25:20 0.02625636 118.06667

The expected output is:

time                    Value1    Value2
2018-04-13 01:19:04 0.09760860  68.41634
2018-04-13 01:20:10 0.24585245  32.94790
2018-04-13 01:21:16 0.24487727  28.99412
2018-04-13 01:54:06 0.22994130  37.63333
2018-04-13 03:27:17 0.11139787  83.40588
2018-04-13 03:36:20 0.04642794 102.15588
2018-04-13 03:37:39 0.04144001 109.93137
2018-04-13 03:38:17 0.03933649 106.77124
2018-04-13 04:04:15 0.27627418  42.60554
2018-04-13 04:13:25 0.14011963  66.10706
2018-04-13 04:13:46 0.09159499  70.96471
2018-04-13 04:24:27 0.03760945 120.97294
2018-04-13 04:24:39 0.02905284 116.59853
2018-04-13 04:24:42 0.03271061 116.60840
2018-04-13 04:25:09 0.02983106 117.32745
2018-04-13 04:25:18 0.03332321 118.45747

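I have been trying to use an rle object to find the consecutive seconds and then take the maximum within each run, but with little success.

1 answer:

Answer 0 (score: 1)

Here is one approach with dplyr: we start a new group whenever the difference between a row and the previous row is not exactly 1 second, then select the row with the maximum Value2 from each group.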

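library(dplyr)
df %>%
  # parse the timestamps (stored as a factor in the data below) into date-times
  mutate(time = as.POSIXct(time)) %>%
  # start a new group whenever the gap to the previous row is not exactly 1 second
  group_by(group = cumsum(time -  lag(time, default = first(time)) != 1)) %>%
  # keep the row with the largest Value2 in each group
  slice(which.max(Value2)) %>%
  ungroup() %>%
  select(-group)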

# A tibble: 16 x 3
#   time                Value1 Value2
#   <dttm>               <dbl>  <dbl>
# 1 2018-04-13 01:19:04 0.0976   68.4
# 2 2018-04-13 01:20:10 0.246    32.9
# 3 2018-04-13 01:21:16 0.245    29.0
# 4 2018-04-13 01:54:06 0.230    37.6
# 5 2018-04-13 03:27:17 0.111    83.4
# 6 2018-04-13 03:36:20 0.0464  102. 
# 7 2018-04-13 03:37:39 0.0414  110. 
# 8 2018-04-13 03:38:17 0.0393  107. 
# 9 2018-04-13 04:04:15 0.276    42.6
#10 2018-04-13 04:13:25 0.140    66.1
#11 2018-04-13 04:13:46 0.0916   71.0
#12 2018-04-13 04:24:27 0.0376  121. 
#13 2018-04-13 04:24:39 0.0291  117. 
#14 2018-04-13 04:24:42 0.0327  117. 
#15 2018-04-13 04:25:09 0.0298  117. 
#16 2018-04-13 04:25:18 0.0333  118. 
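
To make the grouping step easier to follow, here is a minimal base R sketch of the same idea that keeps the intermediate run ids visible. It uses only the last nine rows of the question's data. The names `gap`, `new_run`, `run`, `keep` and `result` are illustrative, and the UTC time zone is an assumption rather than something stated in the question.

```r
# Subset of the question's data (rows from 04:24:27 to 04:25:20)
df <- data.frame(
  time = c("2018-04-13 04:24:27", "2018-04-13 04:24:39", "2018-04-13 04:24:41",
           "2018-04-13 04:24:42", "2018-04-13 04:24:43", "2018-04-13 04:25:09",
           "2018-04-13 04:25:18", "2018-04-13 04:25:19", "2018-04-13 04:25:20"),
  Value2 = c(120.97294, 116.59853, 116.32059, 116.6084, 116.32471,
             117.32745, 118.45747, 117.61882, 118.06667),
  stringsAsFactors = FALSE
)

# Assumption: a fixed time zone, so daylight-saving shifts cannot alter the gaps
df$time <- as.POSIXct(df$time, tz = "UTC")

# Gap to the previous row in seconds; the first row has no predecessor
gap <- c(NA, diff(as.numeric(df$time)))

# A row opens a new run unless it is exactly 1 second after the previous row
new_run <- is.na(gap) | gap != 1
df$run <- cumsum(new_run)

# Keep the row with the largest Value2 in every run
keep <- unlist(lapply(split(seq_len(nrow(df)), df$run),
                      function(i) i[which.max(df$Value2[i])]),
               use.names = FALSE)
result <- df[keep, c("time", "Value2")]
result
```

Printing `df$run` before the last step shows which rows ended up in the same run, which is a quick way to check the grouping rule on new data.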

Data

df <-  structure(list(time = structure(1:21, .Label = c("2018-04-13 01:19:04", 
"2018-04-13 01:20:10", "2018-04-13 01:21:16", "2018-04-13 01:54:06", 
"2018-04-13 03:27:17", "2018-04-13 03:36:20", "2018-04-13 03:37:39", 
"2018-04-13 03:38:17", "2018-04-13 04:04:15", "2018-04-13 04:13:24", 
"2018-04-13 04:13:25", "2018-04-13 04:13:46", "2018-04-13 04:24:27", 
"2018-04-13 04:24:39", "2018-04-13 04:24:41", "2018-04-13 04:24:42", 
"2018-04-13 04:24:43", "2018-04-13 04:25:09", "2018-04-13 04:25:18", 
"2018-04-13 04:25:19", "2018-04-13 04:25:20"), class = "factor"), 
Value1 = c(0.0976086, 0.24585245, 0.24487727, 0.2299413, 
0.11139787, 0.04642794, 0.04144001, 0.03933649, 0.27627418, 
0.11536228, 0.14011963, 0.09159499, 0.03760945, 0.02905284, 
0.02751022, 0.03271061, 0.02836884, 0.02983106, 0.03332321, 
0.03218042, 0.02625636), Value2 = c(68.41634, 32.9479, 28.99412, 
37.63333, 83.40588, 102.15588, 109.93137, 106.77124, 42.60554, 
65.87941, 66.10706, 70.96471, 120.97294, 116.59853, 116.32059, 
116.6084, 116.32471, 117.32745, 118.45747, 117.61882, 118.06667
)), class = "data.frame", row.names = c(NA, -21L))
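
Because `time` is stored as a factor here, it has to be parsed before any arithmetic, and the unit of a date-time difference is worth checking. The sketch below uses three timestamps from the data above. The variable names are illustrative and the UTC time zone is an assumption. It shows that subtracting date-times picks the unit automatically from the smallest gap, so comparing the result with 1 only means "1 second" when the unit really is seconds. In the pipeline above the first difference is zero because of `default = first(time)`, which should keep the unit at seconds there.

```r
# Three timestamps taken from the data above, stored as a factor like df$time
time_f <- factor(c("2018-04-13 01:19:04", "2018-04-13 01:20:10",
                   "2018-04-13 01:21:16"))

# Go through as.character() so the labels, not the integer codes, are parsed
# Assumption: UTC; pass the zone the data was actually recorded in
time <- as.POSIXct(as.character(time_f),
                   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Subtracting date-times picks the unit automatically from the smallest gap;
# every gap here is 66 seconds, so the result is reported in minutes
auto_gap <- diff(time)
units(auto_gap)

# Two ways to get the gaps in seconds regardless of their size
gap_secs  <- diff(as.numeric(time))
gap_secs2 <- as.numeric(diff(time), units = "secs")
gap_secs
```

Fixing the unit explicitly, as in the last two lines, makes a test such as `gap_secs != 1` independent of how far apart the timestamps happen to be.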