All strings within an alphanumeric range in R

Date: 2018-08-29 21:46:12

Tags: r regex string

In R, how can I get a vector of strings given a string "range"? (The equivalent of x:y, but where x and y are strings of the same length.) A couple of examples:

"A001":"A003" == c("A001", "A002", "A003")

"A99":"B02" == c("A99", "B00", "B01", "B02")

Update: I can get "A01":"A10" using

    paste0('A',sprintf("%02d",1:10))
    [1] "A01" "A02" "A03" "A04" "A05" "A06" "A07" "A08" "A09" "A10"

but I am not sure how to go seamlessly from A to B (i.e. "A99":"B02").

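3 answers:

Answer 0: (score: 0)

Yes! Here is a working example. The outer function produces a matrix, but you can always flatten it with nested list and unlist wrappers.

You could build nums in a single step, except that you specified two-digit codes, hence the extra paste0 to zero-pad the single digits 0:9.

Note that this does not directly handle your "wrap-around" case from "A99" to "B02", but it is probably easier to generate a larger list and then subset it.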

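    letts <- c("a", "b", "c")
    # Two-digit suffixes "00".."99": zero-pad the single digits, keep 10..99 as they are
    nums <- c(paste0("0", 0:9), 10:99)
    # outer() returns a matrix of every letter/number pair; flatten it and sort
    sort(unlist(list(outer(letts, nums, paste0))))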
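The "generate a larger list, then subset it" idea can be sketched as follows. This is only an illustration, assuming uppercase letters and two-digit numbers as in the question; the helper name range_by_subset and its arguments are hypothetical and not part of the answer above.

```r
# Hypothetical helper: build every code for the given letters, then slice.
range_by_subset <- function(from, to, letts = LETTERS) {
  nums <- sprintf("%02d", 0:99)            # "00" .. "99"
  # outer() returns a matrix; as.vector() flattens it, sort() orders A00, A01, ...
  all_codes <- sort(as.vector(outer(letts, nums, paste0)))
  i <- match(from, all_codes)              # position of the first endpoint
  j <- match(to, all_codes)                # position of the last endpoint
  stopifnot(!is.na(i), !is.na(j))          # both endpoints must be valid codes
  all_codes[i:j]
}

range_by_subset("A99", "B02")
# [1] "A99" "B00" "B01" "B02"
```

Because the full table is sorted, the wrap-around from one letter to the next falls out of plain index slicing.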

Answer 1: (score: 0)

Punintended's answer inspired me to work out a solution that also covers the "wrap-around" case from "A99" to "B02":

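stringrange = function(x, y){
    # Positions in LETTERS of the first character of each endpoint
    first = which(LETTERS == substr(x, 1, 1))
    last = which(LETTERS == substr(y, 1, 1))
    # Every code <letter>00 .. <letter>99 for each letter between the endpoints
    full = unlist(lapply(LETTERS[first:last],
            function(l) paste0(l, sprintf("%02d", 0:99))))
    # Keep only the slice between the two endpoints
    full[which(full == x):which(full == y)]
    }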

    > stringrange("A98", "C03")
      [1] "A98" "A99" "B00" "B01" "B02" "B03" "B04" "B05" "B06" "B07" "B08" "B09" "B10" "B11" "B12" "B13" "B14" "B15" "B16" "B17" "B18"
     [22] "B19" "B20" "B21" "B22" "B23" "B24" "B25" "B26" "B27" "B28" "B29" "B30" "B31" "B32" "B33" "B34" "B35" "B36" "B37" "B38" "B39"
     [43] "B40" "B41" "B42" "B43" "B44" "B45" "B46" "B47" "B48" "B49" "B50" "B51" "B52" "B53" "B54" "B55" "B56" "B57" "B58" "B59" "B60"
     [64] "B61" "B62" "B63" "B64" "B65" "B66" "B67" "B68" "B69" "B70" "B71" "B72" "B73" "B74" "B75" "B76" "B77" "B78" "B79" "B80" "B81"
     [85] "B82" "B83" "B84" "B85" "B86" "B87" "B88" "B89" "B90" "B91" "B92" "B93" "B94" "B95" "B96" "B97" "B98" "B99" "C00" "C01" "C02"
    [106] "C03"

Answer 2: (score: 0)

You can do this:

str_seq = function(X){
    a = toupper(strsplit(X, ':')[[1]]) # split on ':' and make sure the letters are uppercase
    nums = as.numeric(sub('[A-Z]', '', a)) # extract the numeric parts
    stopifnot(nums < 100) # stop with an error if either number is 100 or greater
    # encode each endpoint as <letter index><two-digit number>, e.g. "I89" -> 989
    a = as.numeric(paste0(setNames(1:26, LETTERS)[sub('\\d+', '', a)], sprintf("%02d", nums)))
    b = do.call(seq, as.list(c(a, by = if (diff(a) > 0) 1 else -1))) # count forward or backward
    b = b[b %% 100 != 0] # remove A00, B00, etc.; drop this line to keep them, as in the question's "B00"
    paste0(LETTERS[b %/% 100], sprintf("%02d", b %% 100))
}

> str_seq('i89:j23')
 [1] "I89" "I90" "I91" "I92" "I93" "I94" "I95" "I96" "I97" "I98" "I99" "J01" "J02" "J03" "J04"
[16] "J05" "J06" "J07" "J08" "J09" "J10" "J11" "J12" "J13" "J14" "J15" "J16" "J17" "J18" "J19"
[31] "J20" "J21" "J22" "J23"
> str_seq('j23:i89')
 [1] "J23" "J22" "J21" "J20" "J19" "J18" "J17" "J16" "J15" "J14" "J13" "J12" "J11" "J10" "J09"
[16] "J08" "J07" "J06" "J05" "J04" "J03" "J02" "J01" "I99" "I98" "I97" "I96" "I95" "I94" "I93"
[31] "I92" "I91" "I90" "I89"
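The same arithmetic idea can be extended to the question's other example, "A001":"A003", where the numeric part has three digits. The sketch below is an illustration only: it assumes each code is a single letter followed by digits, infers the digit width from the first endpoint, and keeps codes such as "B00". The name code_range is hypothetical.

```r
# Hypothetical helper: treat "<letter><digits>" as one integer and count.
code_range <- function(from, to) {
  width <- nchar(from) - 1L                # number of digits after the letter
  stopifnot(nchar(to) == nchar(from), width >= 1L)
  base <- 10^width                         # codes per letter, e.g. 100 for "A00"
  to_int <- function(s) {
    letter <- match(toupper(substr(s, 1, 1)), LETTERS)
    number <- as.integer(substr(s, 2, nchar(s)))
    stopifnot(!is.na(letter), !is.na(number))
    (letter - 1) * base + number           # "A99" -> 99, "B02" -> 102 when base is 100
  }
  vals <- seq(to_int(from), to_int(to))    # seq() counts down when from > to
  paste0(LETTERS[vals %/% base + 1],
         formatC(vals %% base, width = width, flag = "0", format = "d"))
}

code_range("A99", "B02")
# [1] "A99" "B00" "B01" "B02"
code_range("A001", "A003")
# [1] "A001" "A002" "A003"
```

Using a zero-based letter index means the x00 codes need no special handling, and nothing larger than the requested range is ever built.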