Merging records based on a delimiter - Linux

Date: 2014-12-10 13:01:01

Tags: linux awk sed

I need to parse a file and create a new file according to the criteria below.

Here is the sample source data:

TTTTT001:866:           $READ #R1         FILE (TEST-ACCOUNTS) LOGICAL               00085100
TTTTT001-867-            USING DESCRIPTOR (COMM-ACCOUNTS)                            00085200
TTTTT001-868-            STARTING FROM COMM-ACCOUNTS = WS-ACCT-KEY                   00085300
TTTTT001-869-            RECORD (TEST-ACCOUNTS) RELEASE(NO).                         00085400
TTTTT001-870-                                                                        00085500
TTTTT001-871-           IF NOT (LAST-RESPONSE-CODE = 0 OR 3)                         00085600
TTTTT001-872-             MOVE 122 TO ERROR-ABEND-CODE                               00085700
TTTTT001-873-             PERFORM ZT-ERROR.                                          00085800
--
TTTTT001:1018:           $READ #R3         FILE (TEST-ACCOUNTS)                       00100300
TTTTT001-1019-                              ISN (R1-ISN)                              00100400
TTTTT001-1020-                           RECORD (TEST-ACCOUNTS)                       00100500
TTTTT001-1021-                          RELEASE (NO) HOLD.                            00100600
TTTTT001-1022-                                                                        00100700
TTTTT001-1023-           IF LAST-RESPONSE-CODE NOT = 0                                00100800
TTTTT001-1024-              OR R3-ISN NOT = R1-ISN                                    00100900
TTTTT001-1025-             MOVE 122 TO ERROR-ABEND-CODE                               00101000
--

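I would like to get the following:

  1. Each section is separated by -- (the data is the output of grep).
  2. Identify the $READ statement in each section.
  3. Remove the number shown at the end of each line.
  4. Delete the lines after the "." up to the --.
  5. Merge the text from $READ up to the "." into a single line.
  6. So the output should look like this: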

    TTTTT001:866:           $READ #R1         FILE (TEST-ACCOUNTS) LOGICAL USING DESCRIPTOR (COMM-ACCOUNTS) STARTING FROM COMM-ACCOUNTS = WS-ACCT-KEY RECORD (TEST-ACCOUNTS) RELEASE(NO).
    --
    TTTTT001:1018:           $READ #R3         FILE (TEST-ACCOUNTS) ISN (R1-ISN) RECORD (TEST-ACCOUNTS) RELEASE (NO) HOLD.                    
    --
    

2 answers:

Answer 0: (score: 1)

You can try this sed command; it should produce the requested result.

sed -nr '/[T]+[0-9]+:/{:a;N;/\n--/{p;d;t};/\n.*\./{s/(.*) .*\n[^ ]+-(.*) .*/\1 \2/g;s/ +/ /g;p;d;t};s/(.*) .*\n[^ ]+-(.*) .*/\1 \2/g;s/ +/ /g; t a;}' FileName

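Answer 1: (score: 1)

Save the following code as sed.in, then run it with sed -nr -f sed.in filename. It works on my sample input with GNU sed version 4.2.2 on Ubuntu 14.04.

I have added comments in case you want to understand this script and adapt it to your own needs. If you want to learn how to use sed, take a look at this.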

s/[ \t]*[0-9]*$//;h         # remove the trailing number and spaces, then save the line to hold space
:back                       # label
n                           # read the next line
s/[ \t]*[0-9]*$//           # remove the trailing number and spaces
s/TTTTT[^ ]*[ \t]+(.*$)/\1/ # remove the leading prefix and spaces
/\.$/ !{                    # if the line does not end with '.', append it to hold space
    H
    bback                   # jump to label back
}
/\.$/ {                     # the line ends with '.'
    H                       # append it to hold space
    :aa                     # label
    n                       # read the next line
    /^--/ !baa              # skip lines until one starts with '--'
    /^--/ {
        x                   # swap hold space and pattern space
        s/\n/ /g;p          # replace each \n with a space and print the result
        s/^.*$//            # empty the pattern space
        x                   # swap the spaces back
        b                   # start the next cycle
    }
}
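
Since the question is also tagged awk, the steps listed in the question could alternatively be sketched in awk. This is not taken from either answer; it is a minimal sketch that assumes every section begins with a line containing $READ, that the statement ends on the first line whose last character (after the trailing number is stripped) is ".", and that sections are separated by a line holding only --. The function name merge_reads and the state variable are made up for illustration.

```shell
# Hypothetical helper: merge each $READ statement into one line.
# Reads the named files, or standard input when no file is given.
merge_reads() {
    awk '
    /^--[ \t]*$/ {                         # section separator
        if (state == 1) print buf          # statement never reached a "."
        print "--"
        state = 0; buf = ""
        next
    }
    state == 2 { next }                    # statement already printed: skip to "--"
    {
        sub(/[ \t]*[0-9]+[ \t]*$/, "")     # drop the trailing sequence number
        if (state == 0) {
            if (index($0, "$READ") == 0) next   # wait for the $READ line
            buf = $0                       # keep the first line with its prefix
            state = 1
        } else {
            sub(/^[^ \t]+[ \t]*/, "")      # drop a prefix such as TTTTT001-867-
            if ($0 != "") buf = buf " " $0
        }
        if (buf ~ /\.$/) {                 # the statement is complete
            print buf
            state = 2; buf = ""
        }
    }
    END { if (state == 1) print buf }      # input ended without a separator
    ' "$@"
}
```

It could be called as merge_reads FileName > merged.txt. Unlike the sed script above, this sketch also prints the -- separator lines, as in the expected output shown in the question.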