我在解析多行文件时遇到问题。我试过awk
,但我只知道如何使用单行。
文件包含如下记录:
0123456789ab "(channel
(1
(saturation(14))
)
(2
(saturation(41))
)
(3
(saturation(25))
)
(4
(saturation(27))
)
(5
(saturation(33))
(ssid
(0
(ssid(TestingAlpha))
(rssi(5))
)
)
)
(6
(saturation(100))
(ssid
(0
(ssid(TestingBravo))
(rssi(70))
)
(1
(ssid(TestingCharlie))
(rssi(44))
)
)
)
(7
(saturation(40))
)
(8
(saturation(22))
)
(9
(saturation(19))
)
(10
(saturation(20))
)
(11
(saturation(11))
)
(12
)
(13
(saturation(11))
)
)
"
这是一项无线调查。任何可以分析的输出(数据库记录,excel列等)都是可以接受的。
答案 0 :(得分:1)
正如我在评论中所说,Awk在处理表格数据方面非常有效。在这里,您有一种树数据结构。 Awk不合适。
然而,由于格式似乎稳定,我们可以作弊:
BEGIN {
OFS=";"
print "bizid", "ssid", "channel", "saturation", "rssi"
}
NR == 1 {
split($1,A," ")
bizid=A[1]
next
}
{
level = length($1) / 2
}
function clearv(v, R) {
split(v,R,")")
return R[1]
}
level == 1 {
channel=$2
next
}
level == 2 && $2 == "saturation" {
saturation=clearv($3)
next
}
level == 4 && $2 == "ssid" {
ssid=clearv($3)
next
}
level == 4 && $2 == "rssi" {
print bizid, ssid, channel, saturation, clearv($3)
next
}
产生:
bizid;ssid;channel;saturation;rssi
0123456789ab;TestingAlpha;5;33;5
0123456789ab;TestingBravo;6;100;70
0123456789ab;TestingCharlie;6;100;44
似乎可以接受分析。
答案 1 :(得分:1)
它需要调试(不是我!)但是这里是如何处理问题的:编写一个递归函数,每次命中时都会下降"(",构建一个数组由当时调用的当前深度索引,并在您点击最后一个")时打印该数组内容"在字符串中:
$ cat tst.awk
BEGIN { RS="[)]\\s*\"\\s*" }
function descend(tail) {
if ( ++depth == 30 ) {
print "ERROR: went too deep" | "cat>&2"
exit 1
}
while ( match(tail,/([^()]+)([()])(.*)/,a) ) {
val[depth] = gensub(/^\s+|\s+$/,"","g",a[1])
if ( a[2] == "(" ) {
descend(a[3])
}
else {
for (i=1; i<=depth; i++) {
printf "%s,", val[i]
}
print ""
}
tail = a[3]
}
--depth
}
{ sub(/^[^"]+"[(]/,""); descend($0) }
$ awk -f tst.awk file
channel,1,saturation,14,
channel,1,saturation,,
channel,1,saturation,,2,saturation,41,
channel,1,saturation,,2,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,25,
channel,1,saturation,,2,saturation,,3,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,27,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,33,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,TestingAlpha,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,5,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,6,saturation,100,
channel,1,saturation,,2,saturation,,3,saturation,,4,saturation,,5,saturation,,ssid,0,ssid,,rssi,,6,saturation,,ssid,0,ssid,TestingBravo,
ERROR: went too deep
以上使用GNU awk进行多字符RS和gensub()。
我确实喜欢将它转换为JSON然后在其上使用jq
的想法,但不是我对JSON或jq足够熟悉的东西。