Insert a count at a specific position using bash (sed or awk?)

Time: 2015-07-16 11:24:09

Tags: regex bash awk sed pacman

I have a file (mirrorlist.pacnew) that contains mirrors like this:

prakhar@inS4n3 ~ $ cat /etc/pacman.d/mirrorlist.pacnew 
...
## Worldwide
#Server = https://dgix.ru/mirrors/archlinux/$repo/os/$arch
#Server = http://mirror.rackspace.com/archlinux/$repo/os/$arch

## Australia
#Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
...

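I am supposed to pick mirrors and uncomment them. However, the tool rankmirrors can determine the best mirrors for me, so I use sed to uncomment all of them: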

prakhar@inS4n3 ~ $ cat /etc/pacman.d/mirrorlist.pacnew | sed -r 's/^#([^#]+)/#\1\n\1/'
...
## Worldwide
#Server = https://dgix.ru/mirrors/archlinux/$repo/os/$arch
Server = https://dgix.ru/mirrors/archlinux/$repo/os/$arch
#Server = http://mirror.rackspace.com/archlinux/$repo/os/$arch
Server = http://mirror.rackspace.com/archlinux/$repo/os/$arch

## Australia
#Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
...

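I keep the commented lines because rankmirrors prints them, which lets me track progress (it does not print the uncommented lines it is processing).

However, I would like sed/awk to print the server number and the total count on each commented line.

Specifically:

  1. Uncomment the lines, as in the example I gave above.
  2. Print the index of the current #Server entry from the original file (not the actual line number, since the file also contains country names and generic comments).
  3. The final output should look something like this: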

    #22/247 Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
    Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
    

Here is a copy of the complete file.

Edit:

I made some progress on my own and added my work as an answer, since it achieves the goal above but is not optimal.

2 answers:

Answer 0: (score: 2)

Pass the same file to awk twice. On the first pass, get the count. On the second pass, do the substitution.

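awk 'NR==FNR {                           # first pass: only count the mirrors
         if (/^#Server *=/) count++
         next
     }
     /^#Server *=/ {                     # second pass: number each mirror
         sub(/^#*/, "")                  # uncomment the line
         print "#" ++i "/" count " " $0  # progress line, kept as a comment
     }
     1' serverlist serverlist            # the 1 prints every (possibly modified) line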

This gives:

## Worldwide
#1/3 Server = https://dgix.ru/mirrors/archlinux/$repo/os/$arch
Server = https://dgix.ru/mirrors/archlinux/$repo/os/$arch
#2/3 Server = http://mirror.rackspace.com/archlinux/$repo/os/$arch
Server = http://mirror.rackspace.com/archlinux/$repo/os/$arch

## Australia
#3/3 Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
Server = http://mirror.aarnet.edu.au/pub/archlinux/$repo/os/$arch
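
When the mirror list arrives on a pipe, it cannot be handed to awk twice. A minimal single-pass sketch, assuming the whole list fits in memory, buffers the lines and numbers them in the END block. The function name number_mirrors and the a.example/b.example URLs are illustrative only; they are not part of pacman or of the answer above.

```shell
# number_mirrors is a hypothetical helper: it reads a mirror list from the
# given files (or stdin) and writes the numbered, uncommented list to stdout.
number_mirrors() {
    awk '
        { lines[NR] = $0 }                  # remember each line in order
        /^#Server *=/ { total++ }           # count commented mirror entries
        END {
            for (n = 1; n <= NR; n++) {
                line = lines[n]
                if (line ~ /^#Server *=/) {
                    sub(/^#/, "", line)     # drop the leading comment marker
                    i++
                    print "#" i "/" total " " line
                }
                print line                  # headings and blank lines pass through
            }
        }
    ' "$@"
}

# Example on a pipe, where the input cannot be read twice.
printf '%s\n' '## Worldwide' \
    '#Server = http://a.example/$repo/os/$arch' \
    '#Server = http://b.example/$repo/os/$arch' | number_mirrors
```

The trade-off is memory for convenience: the two-pass version keeps nothing in memory but needs a real file, while this one works on any stream.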

Answer 1: (score: 1)

Using only sed and grep:

prakhar@inS4n3 ~ $ COUNT=$(grep -c "Server" /etc/pacman.d/mirrorlist.pacnew); cat /etc/pacman.d/mirrorlist.pacnew | sed -r 's/^#([^#]+)/\1/;tx;d;:x'| sed = | sed 'N;s/\n/ /' | sed -r 's/([0-9]+?)\sServer\s=\s(.*)/#\1 \/ '$COUNT' Trying \2\nServer = \2/'
...
#241 / 247 Trying http://mirrors.rutgers.edu/archlinux/$repo/os/$arch
Server = http://mirrors.rutgers.edu/archlinux/$repo/os/$arch
#242 / 247 Trying http://mirror.umd.edu/archlinux/$repo/os/$arch
Server = http://mirror.umd.edu/archlinux/$repo/os/$arch
#243 / 247 Trying http://mirror.vtti.vt.edu/archlinux/$repo/os/$arch
Server = http://mirror.vtti.vt.edu/archlinux/$repo/os/$arch
#244 / 247 Trying http://mirrors.xmission.com/archlinux/$repo/os/$arch
...
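
The middle of that pipeline relies on a common sed numbering idiom, which is easier to follow in isolation. The sketch below wraps it in a helper called number_lines (a name chosen here for illustration) and runs it on two made-up lines.

```shell
# `sed =` prints each line number on its own line before the line itself;
# the second sed reads the pairs with N and joins number and line with a space.
number_lines() {
    sed = | sed 'N;s/\n/ /'
}

printf '%s\n' 'Server = http://a.example' 'Server = http://b.example' | number_lines
# prints:
# 1 Server = http://a.example
# 2 Server = http://b.example
```

Because the earlier sed stage deletes everything except the uncommented Server lines, these numbers are mirror indexes rather than line numbers in the original file.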

TODO

  1. I am fairly sure this is not optimal.
  2. It is hard to read.
  3. It removes the generic comments (## Worldwide).
  4. Edit: handling the generic comments:

    user@host $ RANDOM_CHARACTER='@'
    user@host $ sed ':b;N; $!bb; s|\n|'"$RANDOM_CHARACTER"'|g;s/#Server/#\nServer/g' /etc/pacman.d/mirrorlist.pacnew | \
        sed '2,$=' | \
        sed -r '/^[0-9]*$/{s|(.*)|echo "$((\1-1))/'$COUNT' "|e; N; s|\n([^'"$RANDOM_CHARACTER"']*)|\1'"$RANDOM_CHARACTER"'\1|}' | \
        sed ':b;N; $!bb;s|\n||g;s|'"$RANDOM_CHARACTER"'|\n|g'
    

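    Choose the "random" character based on the file contents: it can be any character that does not occur in the file and is not used as a delimiter in the sed commands.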
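
One way to pick such a character without inspecting the file by hand is to test a few candidates with grep. This is only a sketch: pick_delimiter is a hypothetical helper, and the candidate list is an assumption that deliberately avoids `|` and `/` (used as sed delimiters above) as well as characters that are special in sed patterns or replacements, such as `^`, `&`, and `\`.

```shell
# pick_delimiter FILE: print the first candidate character that does not
# occur anywhere in FILE; return non-zero if every candidate is present.
pick_delimiter() {
    for candidate in '@' '%' '~'; do
        # -F treats the candidate as a fixed string, -q suppresses output
        if ! grep -qF -- "$candidate" "$1"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible usage with the commands above:
#   RANDOM_CHARACTER=$(pick_delimiter /etc/pacman.d/mirrorlist.pacnew)
```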