Using GAWK

Time: 2017-11-30 00:59:24

Tags: bash shell file parsing awk

I have been trying to parse an ASCII text file in the following format:

0 0 0x2de0 [0x98]: PERF_RECORD_MMAP -1/0: [0xffffffffc06ae000(0x5000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/net/ipv4/netfilter/nf_reject_ipv4.ko

0x2e78 [0x90]: event: 1
.
. ... raw event: size 144 bytes
.  0000:  01 00 00 00 01 00 90 00 ff ff ff ff 00 00 00 00  ................
.  0010:  00 30 6b c0 ff ff ff ff 00 50 00 00 00 00 00 00  .0k......P......
.  0020:  00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64  ......../lib/mod
.  0030:  75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65  ules/4.4.0-83-ge
.  0040:  6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 6e 65 74  neric/kernel/net
.  0050:  2f 69 70 76 34 2f 6e 65 74 66 69 6c 74 65 72 2f  /ipv4/netfilter/
.  0060:  69 70 74 5f 52 45 4a 45 43 54 2e 6b 6f 00 2e 6b  ipt_REJECT.ko..k
.  0070:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0080:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

0 0 0x2e78 [0x90]: PERF_RECORD_MMAP -1/0: [0xffffffffc06b3000(0x5000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/net/ipv4/netfilter/ipt_REJECT.ko

0x2f08 [0x88]: event: 1
.
. ... raw event: size 136 bytes
.  0000:  01 00 00 00 01 00 88 00 ff ff ff ff 00 00 00 00  ................
.  0010:  00 80 6b c0 ff ff ff ff 00 50 00 00 00 00 00 00  ..k......P......
.  0020:  00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64  ......../lib/mod
.  0030:  75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65  ules/4.4.0-83-ge
.  0040:  6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 6e 65 74  neric/kernel/net
.  0050:  2f 6e 65 74 66 69 6c 74 65 72 2f 78 74 5f 74 63  /netfilter/xt_tc
.  0060:  70 75 64 70 2e 6b 6f 00 00 00 00 00 00 00 00 00  pudp.ko.........
.  0070:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0080:  00 00 00 00 00 00 00 00                      

    ........[some other data]........
0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0  offset: 0  ref: 0x2d44e6441a3c2  idx: 0  tid: -1  cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)

.  0000002e:  00 00 00 00 00 00 00 00                         PAD
--- continued ---

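The file will contain multiple headers. Two of them appear in the snippet here:

PERF_RECORD_MMAP
PERF_RECORD_AUXTRACE

There will be other headers in the file as well.

What I want is for only the PERF_RECORD_AUXTRACE headers in my text file to be considered. Only the data that follows a PERF_RECORD_AUXTRACE header (that is, all the data beginning with "Intel Processor Trace data") should be collected. The PERF_RECORD_AUXTRACE header also has a size field, which I could use to specify how much data to collect under that header.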
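To illustrate reading that size field, here is a minimal sketch using POSIX awk and the shell's `printf`. The helper name `auxtrace_sizes` is hypothetical, and the sketch assumes the value always follows a literal `size:` token on the header line. Note that the value counts bytes of raw trace data (0x2002a0 is 2097824, matching the "size 2097824 bytes" line), not lines of the text dump.

```shell
# Sketch: print the size field of every PERF_RECORD_AUXTRACE header in decimal.
# auxtrace_sizes is a hypothetical helper name, not part of perf or awk.
auxtrace_sizes() {
  awk '/PERF_RECORD_AUXTRACE/ {
         # Find the "size:" token and print the field that follows it.
         for (i = 1; i < NF; i++)
           if ($i == "size:") { print $(i + 1); break }
       }' "$@" |
  while read -r hex; do
    # printf accepts C-style hex constants, so 0x2002a0 becomes 2097824.
    printf '%d\n' "$hex"
  done
}
```

It could be called as `auxtrace_sizes "$file"`, or fed the dump on standard input.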

Edit #1

Basically, given the input file snippet above, I want output of the following form (all lines after the record containing PERF_RECORD_AUXTRACE):

.
. ... Intel Processor Trace data: size 2097824 bytes
.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)

.  0000002e:  00 00 00 00 00 00 00 00                         PAD
--- continued ---

Edit #2: Here is another requirement of mine.

Suppose I have an input snippet like the following:

0 0 0x230 [0x60]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x3f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text

0x290 [0x88]: event: 1
.
. ... raw event: size 136 bytes
.  0000:  01 00 00 00 01 00 88 00 ff ff ff ff 00 00 00 00  ................
.  0010:  00 00 00 c0 ff ff ff ff 00 90 00 00 00 00 00 00  ................
.  0020:  00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64  ......../lib/mod
.  0030:  75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65  ules/4.4.0-83-ge
.  0040:  6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 64 72 69  neric/kernel/dri
.  0050:  76 65 72 73 2f 61 74 61 2f 6c 69 62 61 68 63 69  vers/ata/libahci
.  0060:  2e 6b 6f 00 00 00 00 00 00 00 00 00 00 00 00 00  .ko.............
.  0070:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0080:  00 00 00 00 00 00 00 00                          ........

0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0  offset: 0  ref: 0x2d44e6441a3c2  idx: 0  tid: -1  cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)
.  0000002e:  00 00 00 00 00 00 00 00                         PAD
.  00000036:  02 c8 c2 3a 7c 00 00 00                         VMCS 0x7c3ac2

0 0 0x290 [0x88]: PERF_RECORD_MMAP -1/0: [0xffffffffc0000000(0x9000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/drivers/ata/libahci.ko

0x318 [0x98]: event: 1
.
. ... raw event: size 152 bytes
.  0000:  01 00 00 00 01 00 98 00 ff ff ff ff 00 00 00 00  ................
.  0010:  00 90 00 c0 ff ff ff ff 00 50 00 00 00 00 00 00  .........P......
.  0020:  00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64  ......../lib/mod
.  0030:  75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65  ules/4.4.0-83-ge
.  0040:  6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 64 72 69  neric/kernel/dri
.  0050:  76 65 72 73 2f 76 69 64 65 6f 2f 66 62 64 65 76  vers/video/fbdev
.  0060:  2f 63 6f 72 65 2f 66 62 5f 73 79 73 5f 66 6f 70  /core/fb_sys_fop
.  0070:  73 2e 6b 6f 00 00 00 00 00 00 00 00 00 00 00 00  s.ko............
.  0080:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0090:  00 00 00 00 00 00 00 00                          ........


0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0  offset: 0  ref: 0x2d44e6441a3c2  idx: 0  tid: -1  cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)
.  0000002e:  00 00 00 00 00 00 00 00                         PAD
.  00000036:  02 c8 c2 3a 7c 00 00 00                         VMCS 0x7c3ac2

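I only need the data under the records containing PERF_RECORD_AUXTRACE, as shown below. It would be great if the first line, the one containing

Intel Processor Trace data: size 2097824 bytes

could also be left out of my output.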

.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)
.  0000002e:  00 00 00 00 00 00 00 00                         PAD
.  00000000:  02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
.  00000010:  00 00 00                                        PAD
.  00000013:  99 20                                           MODE.TSX TXAbort:0 InTX:0
.  00000015:  99 01                                           MODE.Exec 64
.  00000017:  7d 08 45 06 81 ff ff 00                         FUP 0xffff81064508
.  0000001f:  00 00 00 00 00 00 00                            PAD
.  00000026:  02 43 00 76 49 1f 00 00                         PIP 0xfa4bb00 (NR=0)
.  0000002e:  00 00 00 00 00 00 00 00                         PAD

Edit #3: This is what I initially tried to do... but it obviously doesn't work!

cat "$file" | gawk -F' ' -- '
  /PERF_RECORD_AUXTRACE / {
    offset = strtonum($1)
    hsize  = strtonum(substr($2, 2))
    size   = strtonum($5)
    idx    = strtonum($11)
    ext    = ""


    ofile = sprintf("raw-pt.txt")
    begin = offset + hsize

    cmd = sprintf("dd if=%s of=%s conv=notrunc oflag=append ibs=1 " \
                  "count=%d status=none", file, ofile, size)

    #!cmd = sprintf("sed p")
    if (dry_run != 0) {
      print cmd
    }
    else {
     system(cmd)
    }
  }
'

I am not sure how to parse this file correctly to get what I want. I am also not sure whether using Python would help.

How can I solve this?

1 answer:

Answer 0 (score: 0)

To get your desired output from the input you posted, simply:

awk 'f; /PERF_RECORD_AUXTRACE/{f=1}' file

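If that isn't what you actually want, then edit your question to clarify your requirements and provide different sample input/output that demonstrates your problem more realistically, if necessary.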
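The one-liner sets a flag at the first PERF_RECORD_AUXTRACE line and prints every line after it, so it also prints any PERF_RECORD_MMAP records that appear later, as in the Edit #2 input. A minimal sketch for that case follows. It assumes that every record header looks like `0x<offset> [0x<size>]:`, optionally preceded by two numbers, and that dump lines look like `.  <hex offset>:`. The helper name `extract_auxtrace` is hypothetical.

```shell
# Sketch: print only the hex-dump lines that belong to PERF_RECORD_AUXTRACE
# records. extract_auxtrace is a hypothetical helper name.
extract_auxtrace() {
  awk '
    # Assumed record header: optional "N N " prefix, then "0x... [0x...]:".
    /^([0-9]+ +[0-9]+ +)?0x[0-9a-fA-F]+ +\[0x[0-9a-fA-F]+\]:/ {
      f = ($0 ~ /PERF_RECORD_AUXTRACE/)  # flag is on only for AUXTRACE headers
      next                               # never print the header line itself
    }
    # Inside an AUXTRACE record, keep only dump lines (".  <hex offset>:  ...").
    # This drops the lone "." line, the "Intel Processor Trace data" line,
    # blank lines, and "--- continued ---".
    f && /^\. +[0-9a-fA-F]+:/
  ' "$@"
}
```

Run it as `extract_auxtrace file`. It prints every dump line under each AUXTRACE header, so for the Edit #2 input it also prints the VMCS lines, which the sample output in the question leaves out.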