I have a vector that looks like this:
c(3,4,5,6,7,10,11,14,17,18,19,54,55,56,59,61)->k
How can I easily detect the ranges of consecutive numbers, giving
3:7,10,11,14,17:19,54:56,59,61
and save them in a new vector? Where a range (:) exists, I would also like to store the median of that range, so that the output is
5,10,11,14,18,55,59,61
Is there a fast solution that also handles vectors that are not ascending, for example turning 12,3,4,5,0,7
into 12,4,0,7?
Answer 0 (score: 5)
1) Try this:
tapply(k, cumsum(c(TRUE, diff(k) != 1)), median)
which gives:
   1    2    3    4    5    6    7
 5.0 10.5 14.0 18.0 55.0 59.0 61.0
2) Or this, which labels each run by its first and last value:
f <- function(x) if (length(x) == 1) x else paste(x[1], x[length(x)], sep = ":")
tapply(k, cumsum(c(TRUE, diff(k) != 1)), f)
which gives:
      1       2       3       4       5       6       7
  "3:7" "10:11"    "14" "17:19" "54:56"    "59"    "61"
3) And this, which lists every member of each run:
tapply(k, cumsum(c(TRUE, diff(k) != 1)), toString)
which gives:
              1               2               3               4               5
"3, 4, 5, 6, 7"        "10, 11"            "14"    "17, 18, 19"    "54, 55, 56"
              6               7
           "59"            "61"
4) As well as this, which returns the runs as a list:
split(k, cumsum(c(TRUE, diff(k) != 1)))
which gives:
$`1`
[1] 3 4 5 6 7
$`2`
[1] 10 11
$`3`
[1] 14
$`4`
[1] 17 18 19
$`5`
[1] 54 55 56
$`6`
[1] 59
$`7`
[1] 61
None of the above requires any external packages.
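All four variants share the same grouping key, cumsum(c(TRUE, diff(k) != 1)). The sketch below unpacks it one step at a time; the intermediate names steps, starts and grp are only illustrative and do not appear in the answer.

```r
k <- c(3, 4, 5, 6, 7, 10, 11, 14, 17, 18, 19, 54, 55, 56, 59, 61)

# Difference between each element and its predecessor (length(k) - 1 values)
steps <- diff(k)

# TRUE wherever a new run starts; the first element always starts a run
starts <- c(TRUE, steps != 1)

# Counting the TRUEs seen so far gives every element the id of its run
grp <- cumsum(starts)
grp
```

Elements that share a run id are then handed together to median, f, toString or split.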
Answer 1 (score: 2)
An option using vapply and range (base R functions only):
f1 <- function(x) paste(unique(range(x)), collapse=":")
vapply(split(k, cumsum(c(TRUE,diff(k)!=1))), f1, character(1L))
# 1 2 3 4 5 6 7
# "3:7" "10:11" "14" "17:19" "54:56" "59" "61"
Or, if you need the median:
vapply(split(k, cumsum(c(TRUE,diff(k)!=1))), FUN= median, double(1L))
# 1 2 3 4 5 6 7
# 5.0 10.5 14.0 18.0 55.0 59.0 61.0
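Note that the median collapses the pair 10, 11 to 10.5, whereas the desired output in the question lists 10 and 11 separately. A minimal sketch of one way to reproduce that output, assuming (this is a guess at the asker's intent, not something stated in the thread) that only runs of three or more values should be replaced by their median:

```r
k <- c(3, 4, 5, 6, 7, 10, 11, 14, 17, 18, 19, 54, 55, 56, 59, 61)

# Same grouping key as in the answers
runs <- split(k, cumsum(c(TRUE, diff(k) != 1)))

# Assumption: runs shorter than 3 are kept as individual values
out <- unlist(
  lapply(runs, function(r) if (length(r) >= 3) median(r) else r),
  use.names = FALSE
)
out
```

The threshold of 3 is the only change from the median variants shown in the answers; lowering it to 2 gives 10.5 for the pair again.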
For large vectors, as @David Arenburg mentioned in the comments, some data.table options are:
library(data.table)
as.data.table(k)[, median(k), cumsum(c(TRUE, diff(k) != 1))]
as.data.table(k)[, paste(unique(range(k)), collapse = ":"),
                 cumsum(c(TRUE, diff(k) != 1))]
Using the new, non-ascending vector "k1":
k1 <- c(12,3,4,5,0,7)
vapply(split(k1, cumsum(c(TRUE, diff(k1)!=1))), FUN=median,
double(1L))
# 1 2 3 4
#12 4 0 7
as.data.table(k1)[, median(k1) ,cumsum(c(TRUE, diff(k1)!=1))]
# cumsum V1
# 1: 1 12
# 2: 2 4
# 3: 3 0
# 4: 4 7
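For repeated use, the base R approach can be wrapped in a small function. This is only a sketch: the name run_medians is hypothetical, and the empty-input guard is an addition that is not part of either answer.

```r
# Hypothetical helper: median of each run of values that increase by exactly 1
run_medians <- function(x) {
  if (length(x) == 0L) return(numeric(0))  # nothing to group
  grp <- cumsum(c(TRUE, diff(x) != 1))     # run id for every element
  unname(vapply(split(x, grp), median, double(1L)))
}

run_medians(c(12, 3, 4, 5, 0, 7))
run_medians(c(5, 4, 3))  # steps of -1 do not count, so each value is its own run
```

Because a run is defined by a step of exactly +1, the input does not need to be sorted, but descending sequences are never merged.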