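I am trying to find out whether there is a better way to split a vector into a list so that each run of consecutive identical values ends up in its own group.
Note that the approach must also work when x is a character vector.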
#DATA
x = c(0, 0, 0, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 7, 7, 7, 7)
x
#[1] 0 0 0 7 7 7 7 0 0 0 0 0 0 0 7 7 7 7
#DESIRED OUTPUT
L = list(c(0, 0, 0), c(7, 7, 7, 7), c(0, 0, 0, 0, 0, 0, 0), c(7, 7, 7, 7))
L
#[[1]]
#[1] 0 0 0
#[[2]]
#[1] 7 7 7 7
#[[3]]
#[1] 0 0 0 0 0 0 0
#[[4]]
#[1] 7 7 7 7
#CURRENT APPROACH
split_vector = 0
for (i in 2:length(x)){
split_vector[i] = ifelse(x[i] != x[i-1], max(split_vector) + 1, split_vector[i-1])
}
split(x, split_vector)
#$`0`
#[1] 0 0 0
#$`1`
#[1] 7 7 7 7
#$`2`
#[1] 0 0 0 0 0 0 0
#$`3`
#[1] 7 7 7 7
Answer 0 (score: 5)
Here are some alternatives:
1) Use rle together with rep to build a grouping vector, then split on it. No packages are used.
split(x, with(rle(x), rep(seq_along(values), lengths)))
giving:
$`1`
[1] 0 0 0
$`2`
[1] 7 7 7 7
$`3`
[1] 0 0 0 0 0 0 0
$`4`
[1] 7 7 7 7
2) Using rleid from the data.table package is even simpler:
library(data.table)
split(x, rleid(x))
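To see why alternative 1) works, it may help to look at the intermediate pieces. The following is only a sketch in base R; the variable names r, g, res, y and res_chr are illustrative and not from the answer. It shows what rle returns, the grouping vector built from it, and the same call applied to a character vector.

```r
x <- c(0, 0, 0, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 7, 7, 7, 7)

# rle() records the length and the value of each run
r <- rle(x)
r$lengths  # 3 4 7 4
r$values   # 0 7 0 7

# Repeat the run index 1, 2, 3, ... once per element of that run
g <- rep(seq_along(r$values), r$lengths)
g  # 1 1 1 2 2 2 2 3 3 3 3 3 3 3 4 4 4 4

# Splitting on g yields one list element per run
res <- split(x, g)

# The same expression applied to a character vector
y <- c("a", "a", "b", "a")
res_chr <- split(y, with(rle(y), rep(seq_along(values), lengths)))
```

The grouping vector increases by one at every change of value, which is why two separate runs of the same value (for example the two runs of 0) are kept apart.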
Answer 1 (score: 3)
# Numeric version (diff() only works on numeric input)
tapply(x, cumsum(c(TRUE, diff(x) != 0)), identity)
$`1`
[1] 0 0 0
$`2`
[1] 7 7 7 7
$`3`
[1] 0 0 0 0 0 0 0
$`4`
[1] 7 7 7 7
# Character example
x <- rep(c("a", "b", "c", "a"), c(4, 3, 2, 4))
x
[1] "a" "a" "a" "a" "b" "b" "b" "c" "c" "a" "a" "a" "a"
# Character version
tapply(x, cumsum(c(TRUE, x[-1] != x[-length(x)])), identity)
$`1`
[1] "a" "a" "a" "a"
$`2`
[1] "b" "b" "b"
$`3`
[1] "c" "c"
$`4`
[1] "a" "a" "a" "a"
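Since the comparison x[-1] != x[-length(x)] is not specific to characters, the two versions above could probably be folded into one helper. The sketch below is an assumption-laden illustration rather than part of the answer: split_runs is a hypothetical name, it uses split instead of tapply to get a plain list, and it assumes x contains no NA values.

```r
# split_runs is a hypothetical helper name, not part of base R
split_runs <- function(x) {
  # Nothing to split in an empty vector
  if (length(x) == 0) return(list())
  # TRUE marks the first element of every run; assumes x has no NA values
  starts <- c(TRUE, x[-1] != x[-length(x)])
  # cumsum() turns the run starts into group ids 1, 2, 3, ...
  # unname() drops the "1", "2", ... names that split() adds
  unname(split(x, cumsum(starts)))
}

split_runs(c(0, 0, 0, 7, 7, 7, 7, 0, 0))
split_runs(c("a", "a", "b", "c", "c", "a"))
```

Dropping the names makes the result match the unnamed list in the question's desired output; remove the unname() call if the group labels are useful.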