Creating windows of a sequence in R

Time: 2015-05-04 19:26:59

Tags: r substring

I have a character string and want to store all possible windows of a given size. My first approach was the following:

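teststring <- "ABCBDBDBCBABABD"
n <- nchar(teststring)   # length of the string
k <- 9                   # window size
# All n - k + 1 windows of k consecutive characters
test <- substring(teststring, 1:(n - k + 1), k:n)
# Split each window into characters and bind the windows as columns
test1 <- strsplit(test, "")
test2 <- do.call(cbind, test1)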

Is there a better way to do this? I also need the windows to wrap around cyclically, i.e.:

A  B  C  B  D  B  D  B  C  B  A  B  A  etc.
B  C  B  D  B  D  B  C  B  A  B  A  B
C  B  D  B  D  B  C  B  A  B  A  B  D
B  D  B  D  B  C  B  A  B  A  B  D  A
D  B  D  B  C  B  A  B  A  B  D  A  B
B  D  B  C  B  A  B  A  B  D  A  B  C
D  B  C  B  A  B  A  B  D  A  B  C  B
B  C  B  A  B  A  B  D  A  B  C  B  D
C  B  A  B  A  B  D  A  B  C  B  D  B

The following does not work, as is outlined in the documentation for substring:

test = substring(teststring, 1:n, c(k:n, 1:k))

Any help would be appreciated.

2 answers:

Answer 0 (score: 2)

For the cyclical case, I'd just paste the test string onto itself. Then substring is easy:

t2 <- paste0(teststring, teststring)
# Windows of k characters starting at each of the n positions
substring(t2, first = 1:n, last = (1:n) + k - 1)
# [1] "ABCBDBDBC" "BCBDBDBCB" "CBDBDBCBA" "BDBDBCBAB" "DBDBCBABA" "BDBCBABAB" "DBCBABABD"
# [8] "BCBABABDA" "CBABABDAB" "BABABDABC" "ABABDABCB" "BABDABCBD" "ABDABCBDB" "BDABCBDBD"
# [15] "DABCBDBDB"

If you'd prefer it in a matrix, you can always strsplit after, as akrun suggested in comments.
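A minimal sketch of that matrix step, assuming the window width is exactly k characters and that each window should become one column, as in the layout shown in the question. The names wins and m are only illustrative:

```r
teststring <- "ABCBDBDBCBABABD"
n <- nchar(teststring)
k <- 9

# Double the string so windows that start near the end can wrap around
t2 <- paste0(teststring, teststring)

# One window of exactly k characters per start position
wins <- substring(t2, 1:n, 1:n + k - 1)

# Split each window into single characters; each window becomes a column
m <- do.call(cbind, strsplit(wins, ""))
dim(m)
```

Here m has k rows and n columns, so row 1 spells out the original string and each following row is shifted by one position.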

Answer 1 (score: 2)

If you need the windows of the sequence as a matrix:

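v1 <- strsplit(teststring, '')[[1]]   # vector of single characters
ln <- length(v1)
# Each row of embed() is a run of ln consecutive elements, in reversed order
m1 <- embed(c(v1, v1), ln)
m1[, ncol(m1):1]                      # restore left-to-right order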

Or

matrix(v1, nrow=ln+1, ncol=ln)[-(ln+1),]
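Both snippets return every cyclic shift of the full sequence, one per row (the embed version has ln + 1 rows, with the first shift repeated at the end). To get the k-row layout from the question, it should be enough to keep the first k rows. As a rough sketch, the same matrix can also be built directly with modular indexing; idx, cyc and rec are names made up for this example, and k = 9 is taken from the question:

```r
teststring <- "ABCBDBDBCBABABD"
k <- 9

v1 <- strsplit(teststring, "")[[1]]
ln <- length(v1)

# Entry [i, j] is the 1-based position (i + j - 2) mod ln + 1,
# i.e. row i is the sequence shifted left by i - 1 with wrap-around
idx <- (outer(seq_len(k), seq_len(ln), "+") - 2) %% ln + 1

# Index the character vector with the position matrix and restore the shape
cyc <- matrix(v1[idx], nrow = k)

# Same layout obtained by keeping the first k rows of the recycling trick
rec <- matrix(v1, nrow = ln + 1, ncol = ln)[seq_len(k), ]
identical(cyc, rec)
```

The recycling version works because filling ln + 1 rows with a vector of length ln moves the starting point forward by one element in each new column.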