Sorting lines within blocks in Awk

Asked: 2018-05-18 16:48:50

Tags: arrays sorting awk

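I have a long file containing a list of dependencies, their versions, and the service each dependency belongs to. The file is grouped into blocks, and each block is followed by a line containing only @.

Here is a snippet of the file I am referring to: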

foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
hi.bye:beatles:jar:1.15:compile service: ServiceThree
@

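Notice that within each dependency block the version numbers are only roughly sorted from highest to lowest. I am trying to write an awk script that sorts the lines within each block from the highest version number to the lowest. This is what the output should look like: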

foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@

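Note: the service names in the output do not need to be in any particular order, as long as the versions are sorted from highest to lowest.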

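Logically, I think I should set RS="@", build an array containing each line of the block, then sort that array by version number and print it. The problem is that I don't know how to sort the lines by version number. Here is what I have so far in my awk script: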

BEGIN {
    RS = "@";
}
{
    split($0, lines, "\n");

    # sort the array by the version number from highest to lowest
    # <--- I need help here

    for(key in lines) { print lines[key]; }
    delete lines;
}
END {
}

If this is completely off base, I am open to trying a new approach. Any help with this would be greatly appreciated!

3 Answers:

Answer 0: (score: 4)

Using GNU sort's version sort:

$ awk -F':' -v OFS='\t' 'NF==1{c++} {print c+1, $4, $0}' file  | sort -k1n -k2rV | cut -f3-
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@
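
To see how the pipeline above is put together, it can help to look at the decorated lines before they reach sort. The sketch below wraps the same awk command in a shell function; the name decorate and the two-block sample input are illustrative assumptions, not part of the original answer.

```shell
# decorate (hypothetical helper name): prefix every line with a block
# number and its version field, separated by tabs.
decorate() {
    awk -F':' -v OFS='\t' 'NF==1{c++} {print c+1, $4, $0}' "$@"
}

# Sample input made up for illustration: two one-line blocks.
printf '%s\n' \
    'asm:asm:jar:3.3.1:compile service: ServiceOne' \
    '@' \
    'hi.bye:beatles:jar:1.6:compile service: ServiceOne' \
    '@' | decorate
```

Each line gains two tab-separated prefixes: a block number and the version taken from the fourth colon-separated field (empty for the @ lines). sort -k1n -k2rV then orders by block number first and by reverse version order second, and cut -f3- strips the two prefixes again.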

Answer 1: (score: 4)

Here is another awk approach, which pipes each block through its own sort:

$ awk '/^@/{close(cmd); print; next} 
           {cmd="sort -rV"; print | cmd}' file

foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@
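
The command above version-sorts whole lines, which works here because all lines in a block share the same prefix before the version. A variant that sorts on the version field only might look like the sketch below. It assumes GNU sort (for the V key option); the function name sort_blocks, the added fflush() call, and the sample input are my own additions rather than part of the original answer.

```shell
# sort_blocks (hypothetical name): sort each @-terminated block by the
# fourth colon-separated field, highest version first.
sort_blocks() {
    awk '/^@/ { close(cmd); print; fflush(); next }   # finish the block, then emit @
         { cmd = "sort -t: -k4,4rV -s"; print | cmd }' "$@"
}

# Sample input made up for illustration.
printf '%s\n' \
    'hi.bye:beatles:jar:1.6:compile service: ServiceOne' \
    'hi.bye:beatles:jar:1.5:compile service: ServiceTwo' \
    'hi.bye:beatles:jar:1.15:compile service: ServiceThree' \
    '@' | sort_blocks
```

The -s flag keeps lines with equal versions in their input order, and fflush() is there so that the @ printed by awk is not held in a buffer while the next sort writes its output.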

Answer 2: (score: 1)

Using GNU awk (this needs arrays of arrays and PROCINFO["sorted_in"], both gawk extensions):

$ awk '
BEGIN {
    FS=":"
    PROCINFO["sorted_in"]="@ind_num_desc"  # for-in loops visit indices in descending numeric order
}
$0=="@" {                                  # at the end of a block
    for(i in a)                            # walk each version level, highest first
        for(j in a[i])
            for(k in a[i][j])
                for(l in a[i][j][k])
                    print a[i][j][k][l]    # output
    print "@"                              # block separator
    delete a                               # reset the array for the next block
    next                                   # skip to the next record
}
{
    split($4,b,".")                        # split the version into its parts
    a[b[1]][b[2]][b[3]][--c]=$0            # store the line; the decreasing c keeps input order for ties
}' file
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@

What should have been a quick and beautiful walk in the park turned out to be a nasty hack.
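
For systems without GNU sort or GNU awk, the same idea can be sketched in plain awk with a hand-written version comparison and an insertion sort per block. This is only a sketch under stated assumptions: version components are purely numeric and dot-separated, missing components count as 0, and the names sort_blocks_portable and vercmp are hypothetical.

```shell
# sort_blocks_portable (hypothetical name): no GNU extensions required.
sort_blocks_portable() {
    awk -F':' '
    # Return >0 if version x is newer than y, <0 if older, 0 if equal.
    # Assumes numeric, dot-separated components; missing ones count as 0.
    function vercmp(x, y,    xs, ys, nx, ny, n, i) {
        nx = split(x, xs, ".")
        ny = split(y, ys, ".")
        n = (nx > ny) ? nx : ny
        for (i = 1; i <= n; i++)
            if (xs[i] + 0 != ys[i] + 0)
                return (xs[i] + 0) - (ys[i] + 0)
        return 0
    }
    $0 == "@" {                          # end of block: print it, then reset
        for (i = 1; i <= cnt; i++) print line[i]
        print "@"
        cnt = 0
        next
    }
    {
        # Stable insertion sort, newest first: shift strictly older
        # entries one slot down, then place the new line.
        for (i = cnt; i >= 1 && vercmp($4, ver[i]) > 0; i--) {
            line[i + 1] = line[i]
            ver[i + 1] = ver[i]
        }
        line[i + 1] = $0
        ver[i + 1] = $4
        cnt++
    }
    END {                                # trailing block without a final @
        for (i = 1; i <= cnt; i++) print line[i]
    }' "$@"
}

# Usage: sort_blocks_portable file
```

Insertion sort is quadratic in the block size, which should be acceptable when each block holds only a handful of lines, as in the sample. Lines with equal versions keep their input order.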