Print a block of text to a file from an awk script [banner-like]

Time: 2014-07-06 13:47:59

Tags: file redirect awk message

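I have an awk script that does some processing and sends its output to a file. How can I write a banner-like message to that file first, from the BEGIN block of the awk program, similar to a bash heredoc?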

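I know I could use multiple print commands, but is there a way to do it with a single print command while preserving the multi-line text, newlines and all?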

So the output should look something like this:

#########################################
#      generated by some author         #
#        ENVIRON["VAR"]
#########################################

A further issue, besides getting the formatting right, is that ENVIRON["VAR"] should be expanded in the middle of the string.

3 Answers:

Answer 0 (score: 3)

The simple approach is to write the text heredoc-style as a multi-line shell string and pass it to awk in a variable:

VAR="whatever"
awk -v var="\
#########################################
#      generated by some author         #
#        $VAR
#########################################" '
BEGIN{ print var }
'
#########################################
#      generated by some author         #
#        whatever
#########################################
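A variation on the same idea, offered as a minimal sketch rather than as part of the original answer: build the banner with an actual here-document, hand it to awk through the environment, and let the BEGIN block write it to the output file before anything else. The file name held in `out` comes from `mktemp` and is only a placeholder for this demo. Reading the text from `ENVIRON` instead of `-v` means awk does not interpret backslash escapes in the banner.

```shell
VAR="whatever"

# Build the banner with a real here-document; $VAR is expanded by the shell.
banner=$(cat <<EOF
#########################################
#      generated by some author         #
#        $VAR
#########################################
EOF
)

out=$(mktemp)    # placeholder output file for this demo

# ENVIRON values are used verbatim, whereas -v assignments have escape
# sequences such as \n or \t interpreted by awk.
banner="$banner" awk -v out="$out" 'BEGIN { print ENVIRON["banner"] > out }'

cat "$out"
```

Any later `print ... > out` in the same awk program appends after the banner, because awk keeps the file open once it has been written to.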

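Alternatively, and this may be more than you want, below is the command I use to get something a bit better than here documents in awk. I find it absolutely invaluable when adding template text to multiple files.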

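It is a shell script that takes as input an awk script written in a slightly extended syntax (to support here documents), calls gawk to convert that extended syntax into ordinary awk print statements, and then calls gawk again to execute the resulting script.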

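I call it "epawk", for "extended print" awk. What follows is the tool itself and a few examples of how to use it. When you invoke it instead of calling awk directly, you can write scripts that contain pre-formatted blocks of text for printing, just as you would with a here-doc (the white space before each # is tabs):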

$ export VAR="whatever"
$ epawk 'BEGIN {
    print <<-!
        #########################################
        #      generated by some author         #
        #        "ENVIRON["VAR"]"
        #########################################
    !
}'
#########################################
#      generated by some author         #
#        whatever
#########################################

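It works by creating an awk script from an awk script and then executing that. If you only want to see the script being generated, give epawk the -X argument and it will print the generated script instead of executing it, e.g.: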

$ epawk -X 'BEGIN {
    print <<-!
        #########################################
        #      generated by some author         #
        #        "ENVIRON["VAR"]"
        #########################################
    !
}'
BEGIN {
print "#########################################"
print "#      generated by some author         #"
print "#        "ENVIRON["VAR"]""
print "#########################################"
}
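The rewrite that -X displays can be approximated with a much smaller program. The following is a simplified sketch, not the epawk script itself: the function name `expand_demo` is made up for illustration, it only understands the `<<-` form with a fixed `!` terminator, and it strips leading spaces as well as tabs (the real tool strips tabs only for `<<-`).

```shell
# Simplified sketch: turn a "print <<-!" block into ordinary print statements.
expand_demo() {
    awk '
        # Start of a block: remember that we are inside it, emit nothing.
        !inBlock && /^[ \t]*print[ \t]*<<-!/ { inBlock = 1; next }

        inBlock {
            line = $0
            sub(/^[ \t]+/, "", line)                 # drop leading white space
            if (line == "!") { inBlock = 0; next }   # terminator ends the block
            print "print \"" line "\""               # wrap the text in a print
            next
        }

        { print }                                    # everything else unchanged
    '
}

# Generate the plain awk script, show it, then run it.
script=$(printf 'BEGIN {\n\tprint <<-!\n\t\t# hello "ENVIRON["VAR"]"\n\t!\n}\n' | expand_demo)
printf '%s\n' "$script"
VAR="whatever" awk "$script"
```

Because each text line is wrapped in double quotes, the `"ENVIRON["VAR"]"` in the block closes the string, concatenates the awk expression, and reopens the string, which is why awk variables are written inside double quotes in the pre-formatted text.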

THE SCRIPT:

#!/bin/bash
# The above must be the first line of this script as bash or zsh is
# required for the shell array reference syntax used in this script.

##########################################################
# Extended Print AWK
#
# Allows printing of pre-formatted blocks of multi-line text in awk scripts.
#
# Before invoking the tool, do the following IN ORDER:
#
# 1) Start each block of pre-formatted text in your script with
#       print << TERMINATOR
#    on its own line and end it with
#   TERMINATOR
#    on its own line. TERMINATOR can be any sequence of non-blank characters
#    you like. Spaces are allowed around the symbols but are not required.
#    If << is followed by -, e.g.:
#       print <<- TERMINATOR
#    then all leading tabs are removed from the block of pre-formatted
#    text (just like shell here documents), if it's followed by + instead, e.g.:
#       print <<+ TERMINATOR
#    then however many leading tabs are common across all non-blank lines
#    in the current pre-formatted block are removed.
#    If << is followed by =, e.g.
#       print <<= TERMINATOR
#    then whatever leading white space (tabs or blanks) occurs before the
#    "print" command will be removed from all non-blank lines in
#    the current pre-formatted block.
#    By default no leading spaces are removed. Anything you place after
#    the TERMINATOR will be reproduced as-is after every line in the
#    post-processed script, so this for example:
#   print << HERE |"cat>&2"
#       foo
#   HERE
#    would cause "foo" to be printed to stderr.
#
# 2) Within each block of pre-formatted text only:
#   a) Put a backslash character before every backslash (\ -> \\).
#   b) Put a backslash character before every double quote (" -> \").
#   c) Enclose awk variables in double quotes without leading
#      backslashes (awkVar -> "awkVar").
#   d) Enclose awk record and field references ($0, $1, $2, etc.)
#      in double quotes without leading backslashes ($1 -> "$1").
#
# 3) If the script is specified on the command line instead of via
#    "-f script" then replace all single quote characters (') in or out
#    of the pre-formatted blocks with their ANSI octal escape sequence (\047)
#    or the sequence '\'' (tick backslash tick tick). This is normal and is
#    required because command-line awk scripts cannot contain single quote
#    characters as those delimit the script. Do not use hex \x27, see
#    http://awk.freeshell.org/PrintASingleQuote.
#
# Then just use it like you would gawk with the small caveat that only
# "-W <option>", not "--<option>", is supported for long options so you
# can use "-W re-interval" but not "--re-interval" for example.
#
# To just see the post-processed script and not execute it, call this
# script with the "-X" option.
#
# See the bottom of this file for usage examples.
##########################################################

expand_prints() {

    gawk '

        !inBlock {
            if ( match($0,/^[[:blank:]]*print[[:blank:]]*<</) ) {

                # save any blanks before the print in case
                # skipType "=" is used.
                leadBlanks = $0
                sub(/[^[:blank:]].*$/,"",leadBlanks)

                $0 = substr($0,RSTART+RLENGTH)

                if      ( sub(/^[-]/,"") )  { skipType = "-" }
                else if ( sub(/^[+]/,"") )  { skipType = "+" }
                else if ( sub(/^[=]/,"") )  { skipType = "=" }
                else                        { skipType = ""  }

                gsub(/(^[[:blank:]]+|[[:blank:]]+$)/,"")

                if (/[[:blank:]]/) {
                    terminator = $0
                    sub(/[[:blank:]].*/,"",terminator)

                    postprint = $0
                    sub(/[^[:blank:]]+[[:blank:]]+/,"",postprint)
                }
                else {
                    terminator = $0
                    postprint = ""
                }

                startBlock()

                next
            }
        }

        inBlock {

            stripped=$0
            gsub(/(^[[:blank:]]+|[[:blank:]]+$)/,"",stripped)

            if ( stripped"" == terminator"" ) {
                endBlock()
            }
            else {
                updBlock()
            }

            next
        }

        { print }

        function startBlock() { inBlock=1; numLines=0  }

        function updBlock()   { block[++numLines] = $0 }

        function endBlock(  i,numSkip,indent) {

            if (skipType == "") {
                # do not skip any leading tabs
                indent = ""
            }
            else if (skipType == "-") {
                # skip all leading tabs
                indent = "[\t]+"
            }
            else if (skipType == "+") {

                # skip however many leading tabs are common across
                # all non-blank lines in the current pre-formatted block

                for (i=1;i<=numLines;i++) {

                    if (block[i] ~ /[^[:blank:]]/) {

                        match(block[i],/^[\t]+/)

                        if ( (numSkip == "") || (numSkip > RLENGTH) ) {
                            numSkip = RLENGTH
                        }
                    }
                }

                for (i=1;i<=numSkip;i++) {
                    indent = indent "\t"
                }
            }
            else if (skipType == "=") {
                # skip whatever pattern of blanks existed
                # before the "print" statement
                indent = leadBlanks
            }

            for (i=1;i<=numLines;i++) {
                # anchor with ^ so that only leading white space is removed
                sub("^" indent,"",block[i])
                print "print \"" block[i] "\"\t" postprint
            }

            inBlock=0
        }

    ' "$@"

}

unset awkArgs
unset scriptFiles
expandOnly=0
while getopts "v:F:W:f:X" arg
do
    case $arg in
        f )     scriptFiles+=( "$OPTARG" ) ;;
        [vFW] ) awkArgs+=( "-$arg" "$OPTARG" ) ;;
        X )     expandOnly=1 ;;
        * )     exit 1 ;;
    esac
done
shift $(( OPTIND - 1 ))

if [ -z "${scriptFiles[*]}" -a "$#" -gt "0" ]
then
    # The script cannot contain literal 's because in cases like this:
    #   'BEGIN{ ...abc'def... }'
    # the args parsed here (and later again by gawk) would be:
    #   $1 = BEGIN{ ...abc
    #   $2 = def... }
    # Replace 's with \047 or '\'' if you need them:
    #   'BEGIN{ ...abc\047def... }'
    #   'BEGIN{ ...abc'\''def... }'
    scriptText="$1"
    shift
fi

# Remaining symbols in "$@" must be data file names and/or variable
# assignments that do not use the "-v name=value" syntax.

if [ -n "${scriptFiles[*]}" ]
then
    if (( expandOnly == 1 ))
    then
        expand_prints "${scriptFiles[@]}"
    else
        gawk "${awkArgs[@]}" "$(expand_prints "${scriptFiles[@]}")" "$@"
    fi

elif [ -n "$scriptText" ]
then
    if (( expandOnly == 1 ))
    then
        printf '%s\n' "$scriptText" | expand_prints
    else
        gawk "${awkArgs[@]}" "$(printf '%s\n' "$scriptText" | expand_prints)" "$@"
    fi
else
    # ${0##*/} is the name this script was invoked as
    printf '%s: ERROR: no awk script specified.\n' "${0##*/}" >&2
    exit 1
fi

USAGE EXAMPLES:

$ cat data.txt
abc def"ghi

#######
$ cat script.awk
{
    awkVar="bar" 

    print "----------------"

    print << HERE
    backslash: \\

        quoted text: \"text\"

    single quote as ANSI sequence: \047

    literal single quote (ONLY works when script is in a file): '

    awk variable: "awkVar"

    awk field: "$2"
    HERE

    print "----------------"

    print <<-!
        backslash: \\

            quoted text: \"text\"

        single quote as ANSI sequence: \047

        literal single quote (ONLY works when script is in a file): '

        awk variable: "awkVar"

        awk field: "$2"
    !

    print "----------------"

    print <<+           whatever
        backslash: \\

    quoted text: \"text\"

        single quote as ANSI sequence: \047

        literal single quote (ONLY works when script is in a file): '

        awk variable: "awkVar"

        awk field: "$2"
    whatever

    print "----------------"
}

$ epawk -f script.awk data.txt
----------------
    backslash: \

        quoted text: "text"

    single quote as ANSI sequence: '

    literal single quote (ONLY works when script is in a file): '

    awk variable: bar

    awk field: def"ghi
----------------
backslash: \

    quoted text: "text"

single quote as ANSI sequence: '

literal single quote (ONLY works when script is in a file): '

awk variable: bar

awk field: def"ghi
----------------
    backslash: \

quoted text: "text"

    single quote as ANSI sequence: '

    literal single quote (ONLY works when script is in a file): '

    awk variable: bar

    awk field: def"ghi
----------------

$ epawk -F\" '{
print <<!
    ANSI-tick-surrounded quote-separated field 2 (will work): \047"$2"\047
!
}' data.txt
    ANSI-tick-surrounded quote-separated field 2 (will work): 'ghi'

$ epawk -F\" '{
print <<!
    Shell-escaped-tick-surrounded quote-separated field 2 (will work): '\''"$2"'\''
!
}' data.txt
    Shell-escaped-tick-surrounded quote-separated field 2 (will work): 'ghi'

$ epawk -F\" '{
print <<!
    Literal-tick-surrounded quote-separated field 2 (will not work): '"$2"'
!
}' data.txt
    Literal-tick-surrounded quote-separated field 2 (will not work): 
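The three quoting cases above do not depend on epawk and can be reproduced with plain awk. This is a small sketch under that assumption; `tick_demo` is a made-up helper name, and its two arguments stand in for the positional parameters of the calling shell.

```shell
tick_demo() {
    # Octal escape: awk itself expands \047, so the shell never sees a quote.
    awk 'BEGIN { print "it\047s" }'

    # Shell escape: close the quoted script, add an escaped quote, reopen it.
    awk 'BEGIN { print "it'\''s" }'

    # Bare quote: the script is closed early and $2 is expanded by the shell,
    # so awk receives the second function argument, not its own field 2.
    printf '%s\n' 'abc def"ghi' | awk -F'"' '{ print "field 2: '"$2"'" }'
}

tick_demo first second
```

The last command prints `field 2: second` rather than `field 2: ghi`. In the interactive example above the shell's `$2` is unset, which is why nothing appears after the colon.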

$ epawk -X 'BEGIN{
print <<!
    foo
    bar
!
}'
BEGIN{
print "    foo"
print "    bar"
}

$ cat file
a
b
c

$ epawk '{
    print <<+! |"cat>o2"
        numLines="NR"
                numFields="NF", $0="$0", $1="$1"
    !
}' file

$ cat o2
numLines=1
        numFields=1, $0=a, $1=a
numLines=2
        numFields=1, $0=b, $1=b
numLines=3
        numFields=1, $0=c, $1=c

$ epawk 'BEGIN{

    cmd = "sort"
    print <<+! |& cmd
        d
        b
        a
        c
    !
    close(cmd, "to")

    while ( (cmd |& getline line) > 0 ) {
        print "got:", line
    }
    close(cmd)

}' file
got: a
got: b
got: c
got: d

Answer 1 (score: 1)

$ cat a.awk
BEGIN {
    print "\
#########################################\n\
#      generated by some author         #\n\
#########################################"
}
$ awk -f a.awk
#########################################
#      generated by some author         #
#########################################
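The same single-print style can also splice ENVIRON["VAR"] into the middle of the text, which the question asks for. A minimal sketch, assuming the value arrives through an exported VAR as in the question; `print_banner` is a made-up wrapper name, and the string is assembled by concatenation with backslash line continuations outside the string literals.

```shell
export VAR="whatever"

print_banner() {
    awk 'BEGIN {
        # One print statement: adjacent strings and expressions are
        # concatenated, and each trailing backslash continues the statement.
        print "#########################################\n" \
              "#      generated by some author         #\n" \
              "#        " ENVIRON["VAR"] "\n" \
              "#########################################"
    }'
}

print_banner
```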

Answer 2 (score: 0)

Is this what you are looking for?

var="Peter Hanson"

awk -v auth="$var" '
BEGIN {
    print "#########################################"
    print "#      generated by some author         #"
    printf "#"
    l=int((41-length(auth))/2)
    r=((41-length(auth))/2-l)*2
    for (i=1;i<=l;i++)
        printf " "
    printf "%s",auth
    for (i=1;i<=l+r-2;i++)
        printf " "
    print "#"
    print "#########################################"
}' file
#########################################
#      generated by some author         #
#              Peter Hanson             #
#########################################

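This takes the data in the variable var and prints it as the third line of the banner, padding it so that it is centered. After the last print, add the rest of your code.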
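The centering arithmetic can be packaged more compactly. The following is a sketch, not the answerer's code: `center_line` and `pad` are made-up names, the total width of 41 matches the banner above, and the text is assumed to be shorter than the 39 interior columns.

```shell
center_line() {
    awk -v text="$1" -v width=41 '
        # Return a string of n spaces (empty when n <= 0).
        function pad(n,   s) { while (n-- > 0) s = s " "; return s }

        BEGIN {
            inner = width - 2 - length(text)   # spaces to distribute
            right = int(inner / 2)
            left  = inner - right              # any odd space goes on the left
            print "#" pad(left) text pad(right) "#"
        }'
}

center_line "Peter Hanson"
```

For "Peter Hanson" this gives 14 spaces on the left and 13 on the right, the same split as the output shown above.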