I have a matrix as follows:
TIME YORN
[1,] 24 0
[2,] 26 0
[3,] 28 0
[4,] 30 1
[5,] 32 0
[6,] 34 1
[7,] 36 0
[8,] 38 0
[9,] 40 0
[10,] 42 0
[11,] 44 1
[12,] 45 0
[13,] 48 1
[14,] 50 1
[15,] 53 1
[16,] 54 1
[17,] 56 1
[18,] 58 0
[19,] 60 1
[20,] 62 0
[21,] 64 1
[22,] 67 1
[23,] 68 1
[24,] 70 1
[25,] 72 1
[26,] 74 1
[27,] 89 1
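I want to calculate the total duration of TIME during which the YORN value stays at 1 for more than one consecutive row (rather than immediately dropping back to 0).
How can I do this in R?
Answer 0 (score: 0)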
Here is a possible rle solution (I can't think of a way to simplify it), assuming dat is your matrix:
temp <- rle(dat[, 2] == 1L) # run-length encode the TRUE/FALSE sequence (TRUE where YORN is 1)
temp$values[temp$lengths == 1L] <- FALSE # treat every run of length 1 as FALSE, so isolated 1s are dropped
indx <- inverse.rle(temp) # expand back to a logical vector with one entry per row
indx2 <- cumsum(c(1L, diff(which(indx))) > 1L) # give each remaining run of consecutive rows its own group id
sum(tapply(dat[indx, 1], indx2, function(x) diff(range(x)))) # last TIME minus first TIME per run, then summed
## [1] 33
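To see where the 33 comes from, it can help to look at each run separately. The following is only a minimal sketch, not the answerer's code: it rebuilds the matrix from the question under the assumed name dat, and the names r, starts, ends, keep and durations are introduced here purely for illustration.

```r
# Rebuild the example matrix from the question (assumed to be called dat)
dat <- cbind(
  TIME = c(24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 45, 48, 50,
           53, 54, 56, 58, 60, 62, 64, 67, 68, 70, 72, 74, 89),
  YORN = c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1,
           1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1)
)

r      <- rle(dat[, "YORN"] == 1)   # runs of TRUE (YORN is 1) and FALSE
ends   <- cumsum(r$lengths)         # row index where each run ends
starts <- ends - r$lengths + 1      # row index where each run starts
keep   <- r$values & r$lengths > 1  # runs of 1s that span more than one row

# Last TIME minus first TIME for every qualifying run
durations <- dat[ends[keep], "TIME"] - dat[starts[keep], "TIME"]
durations       # 8 (48 to 56) and 25 (64 to 89)
sum(durations)  # 33
```

Working from the run boundaries avoids inverse.rle and tapply, at the cost of a little index arithmetic.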
Answer 1 (score: 0)
If the correct result is 33:
m <- structure(c(24L, 26L, 28L, 30L, 32L, 34L, 36L, 38L, 40L, 42L,
44L, 45L, 48L, 50L, 53L, 54L, 56L, 58L, 60L, 62L, 64L, 67L, 68L,
70L, 72L, 74L, 89L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L,
0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Dim = c(27L, 2L), .Dimnames = list(NULL, c("TIME", "YORN"
)))
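# rows where a run of 1s begins (YORN goes 0 -> 1, or the first row is already 1)
start <- c(m[1,2] == 1L, diff(m[,2]) == 1L)
# rows where a run of 1s ends (YORN goes 1 -> 0 on the next row, or the last row is 1)
end <- c(diff(m[,2]) == -1L, m[nrow(m),2] == 1L)
sum(m[end, 1] - m[start, 1])
#[1] 33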
Otherwise, you will need to adjust it.
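One point worth spelling out is why isolated 1s need no special handling in this approach: for a run of length one, the start row and the end row are the same row, so it contributes zero to the sum. A small sketch on made-up data (the matrix m2 and the helper names d and runs are hypothetical and do not come from the question):

```r
# Small hypothetical matrix, not the data from the question
m2 <- cbind(TIME = c(10, 12, 14, 16, 18, 20),
            YORN = c(1, 1, 0, 1, 0, 1))

d     <- diff(m2[, "YORN"])
start <- c(m2[1, "YORN"] == 1, d == 1)          # first row of each run of 1s
end   <- c(d == -1, m2[nrow(m2), "YORN"] == 1)  # last row of each run of 1s

# One row per run: the run's first and last TIME
runs <- cbind(start = m2[start, "TIME"], end = m2[end, "TIME"])
runs

# The isolated 1s at TIME 16 and 20 have start == end and add nothing
sum(runs[, "end"] - runs[, "start"])
```

Here only the run from 10 to 12 has a non-zero length, so the total is 2.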
Answer 2 (score: 0)
A dplyr solution:
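library(dplyr)
# df1 is assumed to be a data frame with columns TIME and YORN
df1 %>%
  mutate(
    changed = !is.na(lag(YORN)) & YORN != lag(YORN)) %>% # TRUE where YORN differs from the previous row
  group_by(cumsum(changed), YORN) %>% # one group per run of identical values
  filter(min(TIME) != max(TIME) & YORN == 1) %>% # keep runs of 1s that span more than one TIME
  summarize(TOTAL = sum(TIME - lag(TIME), na.rm = TRUE )) %>% # duration of each run
  ungroup() %>%
  summarize(TOTAL = sum(TOTAL)) # total over all runs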