Zero-padding numbers within a file of items

Date: 2020-08-23 21:54:43

Tags: bash

I am currently parsing some websites to improve my Unix Bash skills. I have extracted a file in the following format:

la-que-no-podia-capitulo-1
la-que-no-podia-capitulo-25
la-que-no-podia-capitulo-30

and would like to end up with this:

la-que-no-podia-capitulo-001
la-que-no-podia-capitulo-025
la-que-no-podia-capitulo-030

Can anyone help me? I have tried different approaches:

  1. Bash RegExp

    x='a-que-no-me-dejas-capitulo-10'
    re='((([[:alpha:]]+(-))+)[[:digit:]]+)'
    if [[ $x =~ $re ]]
    then
        echo The regex matches!
        echo ${BASH_REMATCH[*]}
    fi
    

    (making use of https://stackoverflow.com/a/63551084/10906045)

    Unfortunately, it does not split off the trailing number.

  2. AWK

    awk -F'-' '{ printf "%04d: \n", $NF }' output_downloads >output_downloads2
    head output_downloads2
    
    0001: 
    0002: 
    0003: 
    0004: 
    0050: 
    

    I cannot extract the first part.

1 answer:

Answer 0: (score: 4)

Using awk (GNU awk is required, because the three-argument form of match() is a gawk extension):

    awk '{ match($0, /(.*-)([[:digit:]]+)$/, m); printf("%s%03d\n", m[1], m[2])}' inputfile

Here is the actual awk script:

    {
      # Regex match whole line with 2 capture groups
      match($0, /(.*-)([[:digit:]]+)$/, m)

      # Format print both captured groups
      printf("%s%03d\n", m[1], m[2])
    }
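For awk implementations other than gawk, here is a minimal sketch that relies only on the POSIX two-argument match() and the RSTART variable it sets. It is not part of the original answer: the function name pad_awk is hypothetical, and passing lines without a trailing number through unchanged is an assumption.

```shell
#!/usr/bin/env sh

# pad_awk (hypothetical name): zero-pad the trailing number of each input line
# to 3 digits, using only POSIX awk features (two-argument match, RSTART).
pad_awk() {
  awk '{
    # match() sets RSTART to the position where the trailing digits begin
    if (match($0, /[0-9]+$/)) {
      printf "%s%03d\n", substr($0, 1, RSTART - 1), substr($0, RSTART)
    } else {
      # assumption: lines without a trailing number are left untouched
      print
    }
  }' "$@"
}

# Usage: reads the named files, or standard input when none are given
printf '%s\n' la-que-no-podia-capitulo-1 la-que-no-podia-capitulo-25 | pad_awk
```

Called as `pad_awk inputfile`, it should behave like the gawk one-liner for lines that end in a number.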

Using Bash ERE:

    while IFS= read -r || [[ $REPLY ]]; do
      # Regex match whole line with 2 capture groups
      [[ $REPLY =~ (.*-)([[:digit:]]+)$ ]] || :

      # Format print both captured groups
      printf '%s%03d\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    done <inputfile
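One caveat when the input may already contain partly padded numbers: Bash's printf builtin treats a number with a leading zero as octal, so a value such as 08 is rejected as an invalid octal number. A small sketch that forces base 10 with Bash's `10#` arithmetic prefix follows. The helper name pad_line is hypothetical, and printing non-matching strings unchanged is an assumption rather than something the original answer does.

```shell
#!/usr/bin/env bash

# pad_line (hypothetical helper): zero-pad the trailing number of one string.
pad_line() {
  local re='^(.*-)([[:digit:]]+)$'
  if [[ $1 =~ $re ]]; then
    # 10# forces decimal, so "08" and "09" are not parsed as octal
    printf '%s%03d\n' "${BASH_REMATCH[1]}" "$((10#${BASH_REMATCH[2]}))"
  else
    # assumption: strings without a trailing number are printed as-is
    printf '%s\n' "$1"
  fi
}

pad_line la-que-no-podia-capitulo-8
pad_line la-que-no-podia-capitulo-08
```

Inside the read loop this could be used as `pad_line "$REPLY"`.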

Or using POSIX shell:

    #!/usr/bin/env sh

    while IFS= read -r line || [ "$line" ]; do
      IFS=-
      # Split the line on dashes and fill the positional parameters
      # shellcheck disable=SC2086 # Intended word splitting
      set -- $line
      # Print each argument followed by a dash, except the last one
      while [ $# -gt 1 ]; do
        printf '%s-' "$1"
        shift
      done
      # Print the last argument as a 0-padded, 3-digit integer and a newline
      printf '%03d\n' "$1"
    done <inputfile
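The same split can also be sketched with POSIX parameter expansion, which avoids changing IFS and re-assembling the dash-separated parts. This is an alternative technique, not the original answer's method. The function name pad_file is hypothetical, and the sketch assumes every line ends in a dash followed by a decimal number without leading zeros.

```shell
#!/usr/bin/env sh

# pad_file (hypothetical name): read lines from standard input and zero-pad
# the part after the last dash to 3 digits.
pad_file() {
  while IFS= read -r line || [ -n "$line" ]; do
    prefix=${line%-*}   # everything before the last dash
    number=${line##*-}  # everything after the last dash
    printf '%s-%03d\n' "$prefix" "$number"
  done
}

# Usage with sample lines; for a file, run: pad_file <inputfile
printf '%s\n' la-que-no-podia-capitulo-1 la-que-no-podia-capitulo-30 | pad_file
```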