Counting recurring instances in a vector with R

Time: 2018-04-09 21:00:14

Tags: r

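I am looking for an efficient way in R to convert the vector v <- c(a,a,a,b,a,b,b,c,b) into the compressed vector res <- c(a3,b1,a1,b2,c1,b1), where each run of identical consecutive elements becomes the element followed by its run length.

The elements {a,b,c} may also be longer strings, for example {alfonso,berta,cesar}.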

3 answers:

Answer 0: (score: 4)

You need rle and paste0. First, an example in code:

v <- scan(text = "a,a,a,b,a,b,b,c,b", sep = ",", what = "")
# Read 9 items
v
# [1] "a" "a" "a" "b" "a" "b" "b" "c" "b"
rle(v)
# Run Length Encoding
#   lengths: int [1:6] 3 1 1 2 1 1
#   values : chr [1:6] "a" "b" "a" "b" "c" "b"
paste0(rle(v)$values, rle(v)$lengths)
# [1] "a3" "b1" "a1" "b2" "c1" "b1"
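
The question also mentions longer labels such as alfonso, berta and cesar. As a minimal sketch, the same rle and paste0 combination can be wrapped in a small helper that calls rle only once. The name compress_runs and the sample vector w are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (name is an assumption, not from the answer):
# run rle() once and paste each run's value to its run length.
compress_runs <- function(x) {
  runs <- rle(x)
  paste0(runs$values, runs$lengths)
}

# The vector from the question
v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")
res <- compress_runs(v)

# An illustrative vector with longer labels
w <- c("alfonso", "alfonso", "berta", "cesar", "cesar", "cesar")
res_long <- compress_runs(w)

print(res)
print(res_long)
```

Because rle compares whole elements, the length of each label should not matter; only consecutive repeats are counted.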

Answer 1: (score: 1)

Another option is to use dplyr::lag to generate the output:

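# Data
library(dplyr)  # provides lag() and the %>% pipe
v <- c("a","a","a","b","a","b","b","c","b")
# A new group starts wherever an element differs from its predecessor
sapply(split(v, cumsum(v != dplyr::lag(v, default = " "))),
       function(x) paste0(x[1], length(x))) %>% as.vector()
# Result
# [1] "a3" "b1" "a1" "b2" "c1" "b1"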
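
The grouping idea in this answer (compare each element with its predecessor, then take a cumulative sum to number the runs) can also be sketched in base R without dplyr. This is an illustrative variant, not the answerer's code; the names starts and run_id are assumptions.

```r
v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")

# TRUE wherever a new run starts; the first element always starts a run.
starts <- c(TRUE, v[-1] != v[-length(v)])

# The cumulative sum turns the run starts into a run index per element.
run_id <- cumsum(starts)

# Split by run index, then paste each run's first element to its length.
res <- unname(vapply(split(v, run_id),
                     function(x) paste0(x[1], length(x)),
                     character(1)))

print(run_id)
print(res)
```

Here vapply with character(1) is used instead of sapply so that the result type is fixed, and unname drops the run indices that split attaches as names.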

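Answer 2: (score: 0)

You are describing run-length encoding.

R provides this as base::rle.

To create a single string, you can row-bind the two results that rle provides (values and lengths) and then collapse them together: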

paste(rbind(rle(v)$values, rle(v)$lengths), collapse="")
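
To make the interleaving visible, here is a small sketch of the same idea with the intermediate matrix kept in a variable. The names runs, m and single are illustrative assumptions.

```r
v <- c("a", "a", "a", "b", "a", "b", "b", "c", "b")
runs <- rle(v)

# rbind() coerces the integer lengths to character and yields a
# 2-row matrix: row 1 holds the values, row 2 the run lengths.
m <- rbind(runs$values, runs$lengths)

# paste() reads the matrix column by column, so each value is
# immediately followed by its length before everything is collapsed.
single <- paste(m, collapse = "")

print(m)
print(single)
```

Note that this produces one string rather than a vector of six elements, so it differs from the res vector asked for in the question.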