Conditional sorting with awk or sort

Time: 2013-10-06 06:00:06

Tags: bash shell sorting awk

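OK, so I asked a question about a week ago on how to use sed or awk to extract a block of text between two blank lines while omitting part of the extracted text. The answers I got covered nearly everything I needed, but now I'm adding something extra for fun (and for OCD's sake).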

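This time around I want to sort the output from awk. I found this question & answer, but it didn't help me solve the problem. I also dug through a lot of awk documentation trying to figure out how to do it, to no avail.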

So here's the block of code in my script that does all the dirty work:

# This block of stuff fetches the nameservers as reported by the registrar and DNS zone
# Then it gets piped into awk to work some more formatting magic...
# The following is a step-for-step description since I can't put comments inside the awk block:
# BEGIN:
#     Set the record separator to a blank line
#     Set the input/output field separators to newlines
# FNR == 3:
#     The third block of dig's output is the nameservers reported by the registrar
#     Also blanks the last field & strips it since it's just a useless dig comment
dig +trace +additional "$host" | \
awk -v host="$host" '
    BEGIN {
        RS = "";
        FS = "\n"
    }
    FNR == 3 {
        print "Nameservers of",host,"reported by the registrar:";
        OFS = "\n";
        $NF = ""; sub( /[[:space:]]+$/, "" );
        print
    }
'

If I pass google.com as the value of $host (other hostnames may produce output with a different number of lines), this is the output:

Nameservers of google.com reported by the registrar:
google.com.         172800  IN  NS  ns2.google.com.
google.com.         172800  IN  NS  ns1.google.com.
google.com.         172800  IN  NS  ns3.google.com.
google.com.         172800  IN  NS  ns4.google.com.
ns2.google.com.         172800  IN  A   216.239.34.10
ns1.google.com.         172800  IN  A   216.239.32.10
ns3.google.com.         172800  IN  A   216.239.36.10
ns4.google.com.         172800  IN  A   216.239.38.10

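The idea is to sort that block of text with a conditional algorithm, either inside the existing awk block or by piping awk's output into more awk, sort, or some combination of other tools: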

if ( column 4 == 'NS' )
    sort by column 5
else // This will ensure that the col 1 sort includes A and AAAA records
    sort by column 1

My preferences for an answer are pretty much the same as in my previous question:

        
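  1. Most importantly, it must be portable, since I've run into differing behavior between OS X (my home system) and Fedora (which I use at work) with sed (I had to replace it with gsed on OS X) and with grep's -m flag (used in another script).
  2. An explanation of how the solution works would be much appreciated, as a learning opportunity more than anything else. I've already learned a lot from the awk solution given for my previous question.
  3. If the solution can be implemented within the same awk block, that would be great too.
  4. If not, something simple and elegant that I can pipe awk's output into will suffice.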

2 Answers:

Answer 0: (score: 1)

Here is a solution based on @shellter's idea. Pipe the nameserver records into:

awk '$4 == "NS" {print $1, $5, $0} $4 == "A" {print $1, $1, $0}' | sort | cut -f3- -d' '

Explanation:

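  • With awk we keep only the NS and A records and reprint each line with a prefix: primary sort column + secondary sort column
  • sort then orders the lines; because of how we set up the first and second columns, the order should be the one you want
  • With cut we strip off the prefix that was only there for sorting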
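The decorate, sort, undecorate steps above could also be sketched as a small shell function. This is only an illustration under a few assumptions that are not in the answer: the function name sort_records is made up, NS records are kept ahead of address records by a leading 0/1 flag, every non-NS line (A, AAAA, ...) is sorted on column 1 as in the question's pseudocode, and LC_ALL=C is set so that sort orders bytes the same way on OS X and Fedora. A tab separates the prefix from the original line so that cut can remove it without touching the record's own spacing.

```shell
#!/bin/sh
# sort_records: hypothetical helper, reads DNS record lines on stdin.
sort_records() {
    awk '
        {
            # Sort key: the nameserver (column 5) for NS records,
            # otherwise the owner name (column 1), which covers A and AAAA.
            key = ($4 == "NS") ? $5 : $1
            # Decorate: group flag (0 = NS, 1 = anything else), key, tab, line.
            printf "%d %s\t%s\n", ($4 != "NS"), key, $0
        }
    ' | LC_ALL=C sort | cut -f2-
}
```

Only the record lines should go through this function; a header such as "Nameservers of ..." would be sorted along with them, so it is better printed separately before the pipeline.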

Answer 1: (score: 0)

I know you asked for an awk solution, but since you tagged this with bash, I thought I'd offer a bash version. It should also be more portable than awk ;)

# the whole line
declare -a lines
# the key to use for sorting
declare -a keys

# insert into the arrays at the appropriate position
function insert
{
    local key="$1"
    local line="$2"
    local count=${#lines[*]}
    local i
    # go from the end backwards
    for((i=count; i>0; i-=1))
    do
        # if we have the insertion point, break
        [[ "${keys[i-1]}" > "$key" ]] || break
        # shift the current item to make room for the new one
        lines[i]=${lines[i-1]}
        keys[i]=${keys[i-1]}
    done
    # insert the new item
    lines[i]=$line
    keys[i]=$key
}

# This block of stuff fetches the nameservers as reported by the registrar and DNS zone
#     The third block of dig's output is the nameservers reported by the registrar
#     Also blanks the last field & strips it since it's just a useless dig comment
block=0
dig +trace +additional "$host" |
while read -r f1 f2 f3 f4 f5
do
    # empty line begins new block
    if [ -z "$f1" ]
    then
        # increment block counter
        block=$((block+1))
        # and read next line
        continue
    fi

    # if we are not in block #3, read next line
    [[ $block == 3 ]] || continue

    # ;; ends the block
    if [[ "$f1" == ";;" ]]
    then
        echo "Nameservers of $host reported by the registrar:"
        # print the lines collected so far
        for((i=0; i<${#lines[*]}; i+=1))
        do
            echo "${lines[i]}"
        done
        # don't bother reading the rest
        break
    fi

    # figure out what key to use for sorting
    if [[ "$f4" == "NS" ]]
    then
        key=$f5
    else
        key=$f1
    fi
    # add the line to the arrays
    insert "$key" "$f1 $f2 $f3 $f4 $f5"
done
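
The same insertion-sort idea could probably live inside the question's own awk block, which was one of the stated preferences. The following is a rough sketch rather than a tested drop-in: the wrapper name sorted_nameservers is invented for illustration, it expects dig's output on stdin and the host name as its first argument, and it assumes (as the question's script does) that the last line of the third block is dig's ";; Received" comment. It avoids gawk-only functions such as asort, and the 0/1 prefix on the key, which keeps NS records ahead of the address records, is an added assumption.

```shell
#!/bin/sh
# sorted_nameservers: hypothetical wrapper around the awk block from the question.
sorted_nameservers() {
    awk -v host="$1" '
        BEGIN { RS = ""; FS = "\n" }    # paragraph mode: one field per line
        FNR == 3 {
            print "Nameservers of", host, "reported by the registrar:"
            n = 0
            # Skip the last field, assumed to be the trailing dig comment.
            for (f = 1; f < NF; f++) {
                split($f, col, " ")     # " " splits on runs of whitespace
                # NS records sort on column 5, everything else on column 1.
                key = (col[4] == "NS") ? "0 " col[5] : "1 " col[1]
                # Insertion sort: shift larger keys up, then place the new line.
                for (i = n; i > 0 && keys[i] > key; i--) {
                    keys[i + 1] = keys[i]
                    lines[i + 1] = lines[i]
                }
                keys[i + 1] = key
                lines[i + 1] = $f
                n++
            }
            for (i = 1; i <= n; i++) print lines[i]
        }
    '
}
```

It would be called as: dig +trace +additional "$host" | sorted_nameservers "$host". Note that awk comments inside a single-quoted script must not contain apostrophes, since the shell would treat one as the end of the quoted string.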