AWK time, date, status and sequence from curl output

Time: 2014-05-02 23:00:33

Tags: bash curl awk

I need help with the following awk syntax. Below is my curl output, which I need to tidy up a bit:

INPUT:

        RSYNCA-BACKUP
        RCYNCA 20140517 0021 2182097 2082097
        2014820905820917 10:03:54
        2014820905820917 10:37:43
        0:33:49


        RSYNCB-COPY
        20140517 0020 2082097 1982097 7 6 20
        2014820905820917 09:32:20
        2014820905820917 10:59:20
        1:27:00


        RSYNCC
        RCYNCE 20140517 0021 2182097 2082097
        2014820905820917 10:03:54
        2014820905820917 10:37:43
        0:33:49

        RSYNCD
        20140517 0020 2082097 1982097 7 6 20
        2014820905820917 09:32:20
        2014820905820917 10:59:20
        1:27:00

The output I would like to get from AWK:

RSYNCA-BACKUP|20140502|RCYNCA|10:02:15|10:56:42|0:54:27|FINISHED
RSYNCB-COPY|0022||15:31:06|        |0:06:04|INITIATED

Job Name|sequence|date|start time|end time|runtime|status

Jobs in the INITIATED state have no end time, so that field can be left blank.

This is what I am running, and it produces garbled awk output:

awk -v RS='FINISHED|INITIATED' -v OFS='|' '$0 { print $1, $3, $2, $8, RS }'

RSYNCJOBNA|0021|20140502|2014820905820902|FINISHED|INITIATED
RSYNCJOBNA|0022|20140502|2014820905820902|FINISHED|INITIATED

My curl input has extra whitespace, which I guess might be the problem. Here is a real example:

INITIATED
            RSYNCA
            20140502 0036 3682096 3582096 6 5
            2014820905820902 17:31:08
                0:17:16 ce eque
            INITIATED
            RSYNCA
            20140502 0035 3582096 3482096 6 5
            2014820905820902 17:01:10
                0:47:14 ce eque
            FINISHED
            RSYNCA
            20140502 0034 3482096 3382096 6 5
            2014820905820902 16:31:03
            2014820905820902 17:24:45
            0:53:42
            FINISHED
            RSYNCA
            20140502 0033 3382096 3282096 6 5
            2014820905820902 16:01:09
            2014820905820902 16:47:12
            0:46:03

2 answers:

Answer 0: (score: 3)

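curl "URL" |
    awk -v OFS='|' '/FINISHED|INITIATED/ {
        status = $1; getline;                 # status keyword, then move to the job-name line
        jobname = $1; getline;                # job name, then move to the date/sequence line
        sequence = $2; date = $1; getline;    # date is field 1, sequence is field 2
        start = $2; getline;                  # start time is field 2 of the first timestamp line
        # Only FINISHED jobs have a second timestamp line carrying the end time.
        if (status == "FINISHED") { end = $2; getline } else { end = "        " }
        runtime = $1;
        print jobname, sequence, date, start, end, runtime, status;
    }'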

The output for your input is:

RSYNCA|0036|20140502|17:31:08|        |0:17:16|INITIATED
RSYNCA|0035|20140502|17:01:10|        |0:47:14|INITIATED
RSYNCA|0034|20140502|16:31:03|17:24:45|0:53:42|FINISHED
RSYNCA|0033|20140502|16:01:09|16:47:12|0:46:03|FINISHED
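
For what it's worth, the extra leading whitespace is not the problem: with the default field separator, awk ignores leading and trailing blanks when it splits a line into fields. If you would rather not chain `getline` calls (they silently run past the end of a truncated record), the same parsing can be sketched as a small line counter in portable awk. This is only a sketch under the assumption that every record has the layout shown in the question; `parse_jobs` and `emit` are names invented here, and the sample is a shortened copy of the question's data.

```shell
#!/bin/sh
# Hypothetical helper: reads the curl output on stdin, prints one line per job.
parse_jobs() {
    awk -v OFS='|' '
        # Print the job collected so far, if there is one.
        function emit() {
            if (status != "")
                print jobname, sequence, date, start, end, runtime, status
        }
        # A status keyword starts a new job: flush the previous one and reset.
        $1 == "FINISHED" || $1 == "INITIATED" {
            emit()
            status = $1
            jobname = ""; sequence = ""; date = ""; start = ""; runtime = ""
            end = "        "    # stays blank unless an end timestamp shows up
            n = 0
            next
        }
        status == "" || NF == 0 { next }    # skip preamble and empty lines
        { n++ }                             # n = line number within the job
        n == 1 { jobname = $1; next }
        n == 2 { date = $1; sequence = $2; next }
        n == 3 { start = $2; next }
        # Remaining lines: a bare h:mm:ss in field 1 is the runtime,
        # anything else is assumed to be the end-timestamp line.
        $1 ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9]$/ { runtime = $1; next }
        { end = $2 }
        END { emit() }
    '
}

# Shortened sample taken from the question, indentation included.
parse_jobs <<'EOF'
INITIATED
            RSYNCA
            20140502 0036 3682096 3582096 6 5
            2014820905820902 17:31:08
                0:17:16 ce eque
            FINISHED
            RSYNCA
            20140502 0034 3482096 3382096 6 5
            2014820905820902 16:31:03
            2014820905820902 17:24:45
            0:53:42
EOF
```

In real use you would pipe curl into it, e.g. `curl "URL" | parse_jobs`. A job is printed when the next status keyword (or end of input) is seen, so a record cut short still yields a line with whatever fields were collected.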

Answer 1: (score: 3)

Here's one way using GNU AWK. Run it like this:

curl "$URL" | awk -f script.awk

Contents of script.awk:

BEGIN {

    RS="FINISHED|INITIATED"
    OFS="|"
}

s {
    print ( \
        $1, \
        $3, \
        $2, \
        $9, \
        (s == "FINISHED" ? $11 : "        "), \
        ($NF ~ /:/ ? $NF : $(NF-2)), \
        s \
    )
}

{
    s = RT
}

Results:

RSYNCA|0036|20140502|17:31:08|        |0:17:16|INITIATED
RSYNCA|0035|20140502|17:01:10|        |0:47:14|INITIATED
RSYNCA|0034|20140502|16:31:03|17:24:45|0:53:42|FINISHED
RSYNCA|0033|20140502|16:01:09|16:47:12|0:46:03|FINISHED

Alternatively, here is the one-liner:

curl "$URL" | awk 'BEGIN { RS="FINISHED|INITIATED"; OFS="|" } s { print $1, $3, $2, $9, (s == "FINISHED" ? $11 : "        "), ($NF ~ /:/ ? $NF : $(NF-2)), s } { s = RT }'
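
A note on why the attempt in the question printed `FINISHED|INITIATED` on every line: `RS` holds the separator pattern exactly as it was given, not the text that matched it. GNU AWK puts the matched text in `RT`, which is what the script above saves in `s` for the following record. A minimal sketch of the difference, assuming `gawk` is on your PATH; `show_rs_vs_rt` and the four-line sample are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: for each record, show the separator pattern (RS)
# next to the text that actually terminated the record (RT).
show_rs_vs_rt() {
    gawk -v RS='FINISHED|INITIATED' '
        { printf "RS=%s RT=%s\n", RS, RT }
    '
}

# RT is a gawk extension, so only run the demo when gawk is available.
if command -v gawk >/dev/null 2>&1; then
    printf 'INITIATED\nRSYNCA\nFINISHED\nRSYNCB\n' | show_rs_vs_rt
else
    echo "gawk not found; RT is specific to GNU AWK" >&2
fi
```

Every line shows the same `RS`, while `RT` changes with whichever keyword ended that record. Since the keyword comes before the job it describes, the script keeps the previous `RT` in `s` and prints it with the next record.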