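Multi-line parsing in bash

Time: 2017-12-15 11:30:38

Tags: bash awk

I am having trouble parsing a multi-line file. I tried awk, but I only know how to use it on single lines.

The file contains records like this: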

0123456789ab    "(channel
  (1
    (saturation(14))
  )
  (2
    (saturation(41))
  )
  (3
    (saturation(25))
  )
  (4
    (saturation(27))
  )
  (5
    (saturation(33))
    (ssid
      (0
        (ssid(TestingAlpha))
        (rssi(5))
      )
    )
  )
  (6
    (saturation(100))
    (ssid
      (0
        (ssid(TestingBravo))
        (rssi(70))
      )
      (1
        (ssid(TestingCharlie))
        (rssi(44))
      )
    )
  )
  (7
    (saturation(40))
  )
  (8
    (saturation(22))
  )
  (9
    (saturation(19))
  )
  (10
    (saturation(20))
  )
  (11
    (saturation(11))
  )
  (12
  )
  (13
    (saturation(11))
  )
)
"

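This is a wireless survey. Any output that can be analyzed (database records, Excel columns, etc.) is acceptable.

2 answers:

Answer 0 (score: 1)

As I said in the comments, Awk is very effective at processing tabular data. Here you have a kind of tree data structure, which Awk is not well suited to.

However, since the format seems stable, we can cheat: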

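BEGIN {
    FS="("      # split on "(" so that $1 is the leading indentation
    OFS=";"
    print "bizid", "ssid", "channel", "saturation", "rssi"
}

# First line: the record id precedes the opening quote.
NR == 1 {
    split($1,A," ")
    bizid=A[1]
    next
}

# Nesting level, given two spaces of indentation per level.
{
    level = length($1) / 2
}

# Keep only the part of a value before the first ")".
function clearv(v,      R) {
    split(v,R,")")
    return R[1]
}

level == 1 {
    channel=$2
    saturation=""   # do not carry a value over from the previous channel
    next
}

level == 2 && $2 == "saturation" {
    saturation=clearv($3)
    next
}

level == 4 && $2 == "ssid" {
    ssid=clearv($3)
    next
}

# An rssi line completes one output row.
level == 4 && $2 == "rssi" {
    print bizid, ssid, channel, saturation, clearv($3)
    next
}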

This produces:

bizid;ssid;channel;saturation;rssi
0123456789ab;TestingAlpha;5;33;5
0123456789ab;TestingBravo;6;100;70
0123456789ab;TestingCharlie;6;100;44

This seems acceptable for analysis.
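The level test in the script above only works when fields are split on "(", so that $1 holds nothing but the leading indentation and length($1) / 2 gives the nesting depth. A minimal sketch of that idea on three lines taken from the sample, assuming two spaces per level; the variable name levels is only for illustration:

```shell
# Split each line on "(" so that $1 is the leading indentation;
# two spaces per nesting level is an assumption taken from the sample.
levels=$(printf '%s\n' '  (5' '    (saturation(33))' '        (rssi(5))' |
    awk -F'(' '{ print length($1) / 2, $2 }')
printf '%s\n' "$levels"
```

This should print "1 5", "2 saturation" and "4 rssi", one pair per line. If the indentation width ever changes, the divisor has to change with it.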

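Answer 1 (score: 1)

It needs debugging (not by me!), but here is how to approach the problem: write a recursive function that descends each time it hits a "(", building an array indexed by the current call depth, and prints that array's contents when it hits the last ")" in the string: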

$ cat tst.awk
BEGIN { RS="[)]\\s*\"\\s*" }
function descend(tail) {
    if ( ++depth == 30 ) {
        print "ERROR: went too deep" | "cat>&2"
        exit 1
    }
    while ( match(tail,/([^()]+)([()])(.*)/,a) ) {
        val[depth] = gensub(/^\s+|\s+$/,"","g",a[1])
        if ( a[2] == "(" ) {
            descend(a[3])
        }
        else {
            for (i=1; i<=depth; i++) {
                printf "%s,", val[i]
            }
            print ""
        }
        tail = a[3]
    }
    --depth
}
{ sub(/^[^"]+"[(]/,""); descend($0) }

$ awk -f tst.awk file
channel,1,saturation,14,
channel,1,saturation,,
channel,1,saturation,,2,saturation,41,
channel,1,saturation,,2,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,25,
channel,1,saturation,,2,saturation,,3,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,27,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,33,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,TestingAlpha,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,5,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,6,saturation,100,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,6,saturation,,ssid,0,ssid,TestingBravo,
ERROR: went too deep

The above uses GNU awk for the multi-character RS, gensub(), \s, and the third argument to match().
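The transcript above stops at "went too deep". As an alternative sketch of the same depth-indexed idea (not the answer's own method), the input can be scanned character by character without recursion, using only POSIX awk features. The names sample, prog, paths, emit, path and tok, and the comma-joined output format, are assumptions made for illustration; it also assumes that names and values contain no whitespace.

```shell
# Sample input: a shortened copy of the record from the question.
sample='0123456789ab    "(channel
  (5
    (saturation(33))
    (ssid
      (0
        (ssid(TestingAlpha))
        (rssi(5))
      )
    )
  )
)
"'

# path[d] holds the name seen at depth d; a token that is directly
# followed by ")" is treated as a leaf value and triggers one output line.
prog='
function emit(    i, line) {
    line = id
    for (i = 1; i <= depth; i++) line = line "," path[i]
    print line
}
depth == 0 && /[(]/ { id = $1 }    # record start: remember the id
{
    n = length($0)
    for (k = 1; k <= n; k++) {
        c = substr($0, k, 1)
        if (depth == 0 && c != "(") continue    # skip the id and the quotes
        if (c == "(") {
            if (tok != "") { path[depth] = tok; tok = "" }
            depth++
        } else if (c == ")") {
            if (tok != "") { path[depth] = tok; tok = ""; emit() }
            depth--
        } else if (c != " " && c != "\t") {
            tok = tok c
        }
    }
}'

paths=$(printf '%s\n' "$sample" | awk "$prog")
printf '%s\n' "$paths"
```

Each leaf becomes one line holding the record id and the full path to the value, for example 0123456789ab,channel,5,ssid,0,rssi,5. Note that an empty node such as (12 ) in the full sample would be reported as a leaf (channel,12) under this rule, so it may need filtering afterwards.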

I do like the idea of converting it to JSON and then using jq on it, but I am not familiar enough with JSON or jq to do that.