Adding leading zeros to strings

Asked: 2015-12-01 17:36:05

Tags: regex r

I have a set of column names that I am trying to standardize.

names <- c("apple", "banana", "orange", "apple1", "apple2", "apple10", "apple11", "banana2", "banana12")

I want anything that ends in a single digit to be zero-padded, so that the result is

apple
banana
orange
apple01
apple02
apple10
apple11
banana02
...

I have been trying with stringr:

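library(stringr)

strdouble <- str_detect(names, "[0-9]{2}")  # names containing two consecutive digits
strsingle <- str_detect(names, "[0-9]")     # names containing at least one digit

names[strsingle & !strdouble]               # names with exactly one digit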

but I cannot figure out how to selectively replace or prepend the zero.

4 Answers:

Answer 0 (score: 8)

You can use sub("([a-z])([0-9])$","\\10\\2",names)

[1] "apple"    "banana"   "orange"   "apple01"  "apple02"  "apple10"  "apple11"  "banana02"
[9] "banana12"

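It only changes names in which a single digit follows a letter at the end of the string ($ marks the end of the string).

\\1 selects the first group in (): the letter. Then comes the leading 0, followed by \\2, the second group in (): the digit.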
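As a quick check of which names the pattern touches, here is a minimal sketch that wraps the same sub call in a helper. The name pad_single_digit and the shortened input vector are assumptions made for illustration; they are not part of the answer.

```r
# Hypothetical helper wrapping the sub() call from the answer.
pad_single_digit <- function(x) {
  # ([a-z]) captures the letter before the digit, ([0-9])$ the final digit.
  # "\\10\\2" writes group 1, a literal 0, then group 2.
  sub("([a-z])([0-9])$", "\\10\\2", x)
}

names <- c("apple", "apple1", "apple10", "banana2", "x_1")
padded <- pad_single_digit(names)
print(padded)
```

Note that "x_1" comes back unchanged, because the character before the digit has to match [a-z]. Widening the first group would be needed for names with other separators.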

Answer 1 (score: 6)

Here is an option that uses negative lookahead and lookbehind assertions to identify single digits.

gsub('(?<!\\d)(\\d)(?!\\d)', '0\\1', names, perl=TRUE)
# [1] "apple"    "banana"   "orange"   "apple01"  "apple02"  "apple10"  "apple11"  "banana02" "banana12"
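Because this pattern is not anchored with $, it pads every isolated digit, wherever it sits in the string. The sketch below uses a made-up input vector (an assumption, not data from the question) to show that behaviour.

```r
# Made-up examples with digits in the middle of the string.
x <- c("apple1", "apple10", "a1b2", "v2beta")

# (?<!\\d) and (?!\\d) require that no digit sits directly before or after
# the captured digit, so only digits standing alone are matched.
padded <- gsub("(?<!\\d)(\\d)(?!\\d)", "0\\1", x, perl = TRUE)
print(padded)
```

Whether padding digits in the middle of a name is wanted depends on the data. Appending $ to the pattern would restrict it to a trailing digit.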

Answer 2 (score: 1)

Using str_pad from stringr:

library(stringr)

pad_if = function(x, cond, n, fill = "0") str_pad(x, n*cond, pad = fill)

s = str_split_fixed(names,"(?=\\d)",2)
#       [,1]     [,2]
#  [1,] "apple"  ""  
#  [2,] "banana" ""  
#  [3,] "orange" ""  
#  [4,] "apple"  "1" 
#  [5,] "apple"  "2" 
#  [6,] "apple"  "10"
#  [7,] "apple"  "11"
#  [8,] "banana" "2" 
#  [9,] "banana" "12"

paste0(s[,1], pad_if(s[,2], cond = nchar(s[,2]) > 0, n = max(nchar(s[,2]))))
# [1] "apple"    "banana"   "orange"   "apple01"  "apple02"  "apple10"  "apple11"  "banana02" "banana12"

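This also extends to cases such as going from c("a","a2","a20","a202") to c("a","a002","a020","a202"), which the other approaches do not cover.

The stringr package is built on stringi, which I would guess has all the same functionality used here.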
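The pad-to-the-widest-number behaviour can also be checked without any package. The following is a rough base R sketch of the same idea, not the answer's stringr code; pad_to_widest is a hypothetical name, and it assumes the digits only appear as a trailing run.

```r
# Hypothetical helper: left-pad the trailing digits of each name so that
# all of them are as wide as the widest one. Names without digits are kept.
pad_to_widest <- function(x, fill = "0") {
  stem   <- sub("[0-9]+$", "", x)          # name without its trailing digits
  digits <- substring(x, nchar(stem) + 1)  # trailing digits, "" if none
  width  <- max(nchar(digits))             # width of the widest number
  has_digits <- nchar(digits) > 0
  pad <- strrep(fill, width - nchar(digits))
  digits[has_digits] <- paste0(pad[has_digits], digits[has_digits])
  paste0(stem, digits)
}

result <- pad_to_widest(c("a", "a2", "a20", "a202"))
print(result)
```

strrep requires R 3.3.0 or later.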

Using sprintf from base R, with a similar approach:

pad_if2 = function(x, cond, n, fill = "0") 
  replace(x, cond, sprintf(paste0("%",fill,n,"d"), as.numeric(x)[cond]))

s0 = strsplit(names,"(?<=\\D)(?=\\d)|$",perl=TRUE)

s1 = sapply(s0,`[`,1)
s2 = sapply(sapply(s0,`[`,-1), paste0, "")

paste0(s1, pad_if2(s2, cond = nchar(s2) > 0, n = max(nchar(s2))))

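pad_if2 is less general than pad_if, because it requires that x can be coerced to numeric. Every step here is also clunkier than the corresponding code using the package above.

Answer 3 (score: 0)

The key is to use $ and the character before the digit to identify a single trailing digit. You could try: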

gsub('([^0-9])([0-9])$','\\10\\2',names)
[1] "apple"    "banana"   "orange"   "apple01"  "apple02"  "apple10"  "apple11"  "banana02" "banana12"

Or with a lookbehind:

gsub('(?<=[a-z])(\\d)$','0\\1',names,perl=TRUE)