I am looking for an efficient way in R to convert the vector v <- c(a,a,a,b,a,b,b,c,b)
into the compressed vector res <- c(a3,b1,a1,b2,c1,b1).
The labels {a,b,c} may also be longer, for example {alfonso,berta,cesar}.
Answer 0 (score: 4)
You need rle and paste0, but first an example in code:
v <- scan(text= "a,a,a,b,a,b,b,c,b", sep=",", what="")
#Read 9 items
v
# [1] "a" "a" "a" "b" "a" "b" "b" "c" "b"
rle(v)
# Run Length Encoding
#   lengths: int [1:6] 3 1 1 2 1 1
#   values : chr [1:6] "a" "b" "a" "b" "c" "b"
paste0(rle(v)$values, rle(v)$lengths)
# [1] "a3" "b1" "a1" "b2" "c1" "b1"
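The same idea can be wrapped in a small function so that rle is only computed once. This is a minimal sketch: the name compress_runs and its sep argument are hypothetical and not part of the answer. It also covers the longer labels mentioned in the question.

```r
# Hypothetical helper (not from the answer): run-length encode a character
# vector and glue each value to the length of its run.
compress_runs <- function(x, sep = "") {
  r <- rle(x)  # compute the encoding once and reuse both components
  paste(r$values, r$lengths, sep = sep)
}

v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")
res <- compress_runs(v)
print(res)

# Longer labels, as mentioned in the question
w <- c("alfonso", "alfonso", "berta", "cesar", "cesar", "cesar")
res_long <- compress_runs(w)
print(res_long)
```

If the labels themselves can end in digits, passing a separator such as sep = "_" keeps the label and the count distinguishable.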
Answer 1 (score: 1)
One option is to use dplyr::lag to generate the output:
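library(dplyr)  # provides lag() and the %>% pipe
# Data
v <- c("a","a","a","b","a","b","b","c","b")
# Start a new group wherever an element differs from its predecessor
sapply(split(v, cumsum(v != dplyr::lag(v, default = " "))),
       function(x) paste0(x[1], length(x))) %>% as.vector()
# Result
# [1] "a3" "b1" "a1" "b2" "c1" "b1"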
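To see why the split works, it can help to look at the intermediate run ids. The following is a sketch in base R only. It swaps dplyr::lag for a direct comparison of the vector with itself shifted by one position, so it is a stand-in for the approach above, not the answer's exact code. The names starts and run_id are illustrative, and the sketch assumes v is non-empty.

```r
v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")

# TRUE wherever an element differs from its predecessor; the first element
# always starts a run. (Base-R stand-in for v != dplyr::lag(v, default = " ").)
starts <- c(TRUE, v[-1] != v[-length(v)])

# The cumulative sum turns the run starts into one run id per element
run_id <- cumsum(starts)
print(run_id)

# One group per run: first element of the group plus the group size
res <- unname(vapply(split(v, run_id),
                     function(x) paste0(x[1], length(x)),
                     character(1)))
print(res)
```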
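Answer 2 (score: 0)
You are describing run-length encoding.
R provides this as base::rle.
To create a single string, you can row-bind the two components returned by rle (values and lengths) and then collapse them together: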
paste(rbind(rle(v)$values, rle(v)$lengths), collapse="")
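The row-binding step may be easier to follow with the intermediate matrix made explicit. A minimal sketch, where the variable names r, m and res are only illustrative:

```r
v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")
r <- rle(v)

# rbind() coerces the lengths to character and stacks them under the values,
# giving a 2 x 6 character matrix
m <- rbind(r$values, r$lengths)
print(m)

# Matrices are stored column by column, so flattening m interleaves each
# value with its length; collapse = "" joins the pieces into one string
res <- paste(m, collapse = "")
print(res)
```

Note that this yields a single string rather than the vector of six elements asked for in the question.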