Question

I have a matrix like this:

        TIME  YORN
 [1,]   24  0
 [2,]   26  0
 [3,]   28  0
 [4,]   30  1
 [5,]   32  0
 [6,]   34  1
 [7,]   36  0
 [8,]   38  0
 [9,]   40  0
[10,]   42  0
[11,]   44  1
[12,]   45  0
[13,]   48  1
[14,]   50  1
[15,]   53  1
[16,]   54  1
[17,]   56  1
[18,]   58  0
[19,]   60  1
[20,]   62  0
[21,]   64  1
[22,]   67  1
[23,]   68  1
[24,]   70  1
[25,]   72  1
[26,]   74  1
[27,]   89  1

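I want to calculate the total duration of "TIME" over which the "YORN" value stays at 1 for more than one consecutive row (rather than changing straight back to 0).

How can I achieve this in R?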

Answer 1

Here is a possible rle solution (I can't think of a way to simplify it), assuming dat is your matrix:

temp <- rle(dat[, 2] == 1L) # capture sequences of 1
temp$values[temp$lengths == 1L] <- FALSE # set all the smaller than 2 sequences to FALSE
indx <- inverse.rle(temp) # reverse back to the original vector size but with correct indexes
indx2 <- cumsum(c(1L, diff(which(indx == 1L))) > 1L) # separate to groups
sum(tapply(dat[indx, 1], indx2, function(x) diff(range(x)))) # sum the differences
## [1] 33
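
To make the run-length idea easier to follow, here is a minimal sketch (not the answerer's code) that computes each qualifying run's duration separately before summing. The names runs, starts, ends, keep and spans are made up for this illustration, and dat is rebuilt from the values in the question.

```r
# Rebuild the question's matrix (values copied from the question).
dat <- cbind(
  TIME = c(24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 45, 48, 50,
           53, 54, 56, 58, 60, 62, 64, 67, 68, 70, 72, 74, 89),
  YORN = c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1,
           1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1)
)

runs   <- rle(dat[, "YORN"] == 1)         # run-length encoding of the 1/0 flag
ends   <- cumsum(runs$lengths)            # last row index of each run
starts <- ends - runs$lengths + 1         # first row index of each run
keep   <- runs$values & runs$lengths > 1  # runs of 1s spanning at least two rows

# Duration of each kept run = TIME at its last row minus TIME at its first row.
spans <- dat[ends[keep], "TIME"] - dat[starts[keep], "TIME"]
spans       # one duration per qualifying run
sum(spans)  # total duration
```

With the question's data, two runs qualify (TIME 48 to 56 and TIME 64 to 89), so the total agrees with the 33 shown above.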

Answer 2

If the correct result is 33:

m <- structure(c(24L, 26L, 28L, 30L, 32L, 34L, 36L, 38L, 40L, 42L, 
44L, 45L, 48L, 50L, 53L, 54L, 56L, 58L, 60L, 62L, 64L, 67L, 68L, 
70L, 72L, 74L, 89L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 
0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Dim = c(27L, 2L), .Dimnames = list(NULL, c("TIME", "YORN"
)))

start <- c(m[1,2] == 1L, diff(m[,2]) == 1L)
end <- c(diff(m[,2]) == -1L, m[nrow(m),2] == 1L)

sum(m[end, 1] - m[start, 1])
#[1] 33

Otherwise, you will need to adjust it.
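
If you do need to adjust it, it may help to look at the start/end pairs one by one. The sketch below is an illustration, not part of the original answer. It rebuilds the same matrix with cbind and tabulates each run of 1s; the name runs and its column labels are made up for this example.

```r
# Same data as the structure() call above, rebuilt with cbind for brevity.
m <- cbind(
  TIME = c(24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 45, 48, 50,
           53, 54, 56, 58, 60, 62, 64, 67, 68, 70, 72, 74, 89),
  YORN = c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1,
           1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1)
)

# TRUE on the first row of every run of 1s (including a run starting at row 1).
start <- c(m[1, 2] == 1, diff(m[, 2]) == 1)
# TRUE on the last row of every run of 1s (including a run ending at the last row).
end <- c(diff(m[, 2]) == -1, m[nrow(m), 2] == 1)

# One row per run: where it starts, where it ends, and how long it lasts.
runs <- cbind(start = m[start, 1], end = m[end, 1])
runs <- cbind(runs, duration = runs[, "end"] - runs[, "start"])
runs
```

A run that lasts a single row has the same start and end time, so it contributes 0 to the sum. That is why this approach needs no explicit filter for isolated 1s.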

Answer 3

A dplyr solution:

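library(dplyr)

# Assumes df1 is the question's matrix as a data frame, e.g. df1 <- as.data.frame(dat)
df1 %>%
  mutate(
    # TRUE on every row where YORN differs from the previous row
    changed = !is.na(lag(YORN)) & YORN != lag(YORN)) %>%
  # one group per run of identical YORN values
  group_by(cumsum(changed), YORN) %>%
  # keep only runs of 1s that span more than one TIME value
  filter(min(TIME) != max(TIME) & YORN == 1) %>%
  summarize(TOTAL = sum(TIME - lag(TIME), na.rm = TRUE)) %>%
  ungroup() %>%
  summarize(TOTAL = sum(TOTAL))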
