AWK: How to exactly match and print multiple words between two keywords on the same line

Time: 2015-12-13 10:51:26

Tags: regex bash awk

Consider a text file named "nett" with the following content:

admin@(none):/tmp/home/root# cat nett
BSSID: 00:22:07:29:D4:23 RSSI: -71 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: Inteno_24  noise: -70
BSSID: 00:19:77:12:97:94 RSSI: -54 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: AK-Gjester  noise: -70
BSSID: 00:19:77:12:97:95 RSSI: -55 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: AK-Ansatt  noise: -70
BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87
BSSID: FA:8F:CA:88:F9:8E RSSI: -45 dBm Band: 2.4GHz Channel: 6 802.11: b/g/n SSID: Chromecast6286  noise: -87
BSSID: 00:22:07:3F:67:6B RSSI: -86 dBm Band: 2.4GHz Channel: 13 802.11: b/g/n SSID: Inteno-676C  noise: -87

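I am trying to use awk to print formatted data from this file to the terminal. This is part of a much longer script. The following script illustrates the problem I need to solve: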

#!/bin/sh
awk ' \
    {for(i=1;i<=NF;i++)if($i~/SSID:/)printf "%s%s", "BSSID: " $(i+1)} \
    {for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%s%s\n", "Kanal: " $(i+1)}' nett

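The second line, the one with Channel:, works fine: awk loops through each line one word at a time, searching for the word "Channel:", and then prints the next word with some custom text. A line may contain a variable number of columns, so targeting a specific column number will not always work here.

The real problem, however, is the first line. There are two issues to solve:

1: Since there is both a word "BSSID" and a word "SSID", the search pattern needs to be exact. At the moment both "BSSID" and "SSID" are matched.

2: The text I need to print may be more than one word, as shown on the fourth line: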

BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87

Here I need awk to find the multiple words between SSID: and noise: and print all of them.

With the script in its current state, I get this output:

BSSID: 02:26:16:B2:37:ADBSSID: TrimbleKanal: 6

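A pure awk solution would be much appreciated, since awk will handle the rest of the output correctly. Note that the output lacks spacing in the right places; this is to keep the problem as compact and visible as possible.

Best regards!

6 answers:

Answer 0 (score: 3)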

Using gawk capture groups:

$ gawk 'match($0,/^BSSID:\s+(\S+).*Channel:\s+(\S+).*SSID:\s+(.*noise:\s+\S+)/,a)\
  {print a[1],a[3],a[2]}' nett

Column by column:

$ gawk 'match($0,/BSSID:\s+(\S+)/,a){printf(a[1]" ")} match($0,/\s+SSID:\s+(.*noise:\s+\S+)/,a){printf a[1]" "} match($0,/\s+Channel:\s+(\S+)/,a){printf a[1]"\n"}' nett

Note: \s+SSID:\s+(.*noise:\s+\S+) captures everything between the SSID and noise columns (the noise value included).

Results:

00:22:07:29:D4:23 Inteno_24 noise: -70 1
00:19:77:12:97:94 AK-Gjester noise: -70 1
00:19:77:12:97:95 AK-Ansatt noise: -70 1
02:26:16:B2:37:AD Trimble Service (5132555899) noise: -87 6
FA:8F:CA:88:F9:8E Chromecast6286 noise: -87 6
00:22:07:3F:67:6B Inteno-676C noise: -87 13

See the gawk documentation.
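
For awks that lack gawk's three-argument match(), something similar for the SSID column can probably be done with the POSIX two-argument match() and its RSTART/RLENGTH variables. This is only a sketch, assuming every line has "noise:" somewhere after "SSID:"; the variable names (line, ssid, out) are mine and the sample line is copied from the question.

```shell
#!/bin/sh
# Sample line copied from the question's "nett" file.
line='BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87'

out=$(printf '%s\n' "$line" | awk '
{
    # The leading space keeps " SSID: " from matching inside "BSSID: ".
    if (match($0, / SSID: .* +noise: /)) {
        # Skip the 7 characters of " SSID: " at the start of the match.
        ssid = substr($0, RSTART + 7, RLENGTH - 7)
        # Drop the trailing padding and the "noise: " label.
        sub(/ +noise: $/, "", ssid)
        print ssid
    }
}')
printf '%s\n' "$out"
```

Because only POSIX features are used here (no \s, \S or array argument to match), this should behave the same in gawk, mawk and busybox awk.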

Answer 1 (score: 3)

Based on the answer you posted, here is a reasonable way to implement the awk part from your question (and the shell printf '\n' from your answer):

$ cat tst.sh
awk '
BEGIN { FS=": +" }
{
    for (i=1;i<=NF;i++) {
        value = $i; sub(/ +[^ ]+$/,"",value)
        n2v[name] = value
        name = $i; sub(/.* /,"",name)
    }
    printf "SSID: %-32s",    n2v["SSID"]
    printf "BSSID: %-20s",   n2v["BSSID"]
    printf "RSSI: %s, ",     n2v["RSSI"]
    printf "noise: %s dBm ", n2v["noise"]
    printf "Kanal: %-2s\n",  n2v["Channel"]
}
END { print "" }
' nett

$ ./tst.sh
SSID: Inteno_24                       BSSID: 00:22:07:29:D4:23   RSSI: -71 dBm, noise: -70 dBm Kanal: 1
SSID: AK-Gjester                      BSSID: 00:19:77:12:97:94   RSSI: -54 dBm, noise: -70 dBm Kanal: 1
SSID: AK-Ansatt                       BSSID: 00:19:77:12:97:95   RSSI: -55 dBm, noise: -70 dBm Kanal: 1
SSID: Trimble Service (5132555899)    BSSID: 02:26:16:B2:37:AD   RSSI: -73 dBm, noise: -87 dBm Kanal: 6
SSID: Chromecast6286                  BSSID: FA:8F:CA:88:F9:8E   RSSI: -45 dBm, noise: -87 dBm Kanal: 6
SSID: Inteno-676C                     BSSID: 00:22:07:3F:67:6B   RSSI: -86 dBm, noise: -87 dBm Kanal: 13

$

The above will work with any awk on any OS.
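
To see why the loop pairs each value with the name found in the previous field, it may help to look at how FS=": +" splits one record. The snippet below is just an illustration; the variable names and the "field N:" output format are mine, and the sample line is copied from the question.

```shell
#!/bin/sh
# One record from the question's "nett" file.
line='BSSID: 00:22:07:29:D4:23 RSSI: -71 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: Inteno_24  noise: -70'

# Split on a colon followed by one or more spaces and show every field.
# Colons inside the MAC address are not followed by a space, so they stay put.
out=$(printf '%s\n' "$line" | awk -F': +' '
{
    for (i = 1; i <= NF; i++)
        printf "field %d: [%s]\n", i, $i
}')
printf '%s\n' "$out"
```

Every field except the first and last holds "value next-name", e.g. field 7 is "Inteno_24  noise". That is why the script strips the last word to get the value and keeps only the last word as the name for the following field.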

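Answer 2 (score: 2)

You can use some character-index math to extract the words between SSID and noise (issue 1 in your question), and match SSID and BSSID separately. Ugly, but it does the job :-/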

#!/bin/sh
awk ' \
{for(i=1;i<=NF;i++)if($i~/BSSID:/)printf "BSSID%s", $(i+1)}
{for(i=1;i<=NF;i++)if($i~/^SSID:/){ s=index($0," SSID");e=index($0," noise"); printf "SSID:%s", substr($0,s+5,e-(s+6))}} \
{for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%s%s\n", "Kanal: ", $(i+1)}' nett

On the sample input file, this produces the following output:

BSSID:: Inteno_24Kanal: 1
BSSID00:19:77:12:97:94BSSID:: AK-GjesterKanal: 1
BSSID00:19:77:12:97:95BSSID:: AK-AnsattKanal: 1
BSSID02:26:16:B2:37:ADBSSID:: Trimble Service (5132555899)Kanal: 6
BSSIDFA:8F:CA:88:F9:8EBSSID:: Chromecast6286Kanal: 6
BSSID00:22:07:3F:67:6BBSSID:: Inteno-676CKanal: 13
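
The doubled colon in that output comes from the substring starting at the ":" of "SSID:". A possible variant of the same index arithmetic that skips the whole " SSID: " label is sketched below. It assumes "noise:" always follows the SSID on the line; the variable names are mine and the sample line is copied from the question.

```shell
#!/bin/sh
line='BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87'

out=$(printf '%s\n' "$line" | awk '
{
    s = index($0, " SSID: ")    # position of the space before "SSID:"
    e = index($0, " noise:")    # position of the space before "noise:"
    if (s && e > s) {
        # " SSID: " is 7 characters long, so the name starts at s + 7.
        ssid = substr($0, s + 7, e - (s + 7))
        sub(/ +$/, "", ssid)    # trim any padding left before "noise:"
        printf "SSID: %s\n", ssid
    }
}')
printf '%s\n' "$out"
```

Searching for " SSID: " with the leading space also means the per-field loop is no longer needed to tell SSID from BSSID.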

Answer 3 (score: 1)

With gawk:

awk 'function w(m){match($0,"\\<"m": ([^ ]*) ",a);return a[1]}{print w("BSSID"),w("SSID"),w("Channel")}' file
00:22:07:29:D4:23 Inteno_24 1
00:19:77:12:97:94 AK-Gjester 1
00:19:77:12:97:95 AK-Ansatt 1
02:26:16:B2:37:AD Trimble 6
FA:8F:CA:88:F9:8E Chromecast6286 6
00:22:07:3F:67:6B Inteno-676C 13
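
The \< word-boundary operator used above is a gawk extension. When the labels are whole whitespace-separated fields, as they are in this file, a plain string comparison on the field gives the same exactness in any awk. A small check, with counter names of my own choosing and a sample line copied from the question:

```shell
#!/bin/sh
line='BSSID: 00:22:07:29:D4:23 RSSI: -71 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: Inteno_24  noise: -70'

out=$(printf '%s\n' "$line" | awk '
{
    loose = 0; exact = 0
    for (i = 1; i <= NF; i++) {
        if ($i ~ /SSID:/)  loose++    # unanchored: also hits "BSSID:"
        if ($i == "SSID:") exact++    # whole-field comparison: only "SSID:"
    }
    print loose, exact
}')
printf '%s\n' "$out"
```

The unanchored regex counts two fields and the comparison counts one, which is issue 1 from the question in miniature.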

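Answer 4 (score: 1)

Here is my final result (all comments, variables and file names are in Norwegian). The solution to this particular problem can be found by searching for "# Presenter nabonettverk".
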
#!/bin/sh

# Sletter eventuelle eldre filer
if [ -e linje ] ; then rm linje ; fi
if [ -e temp_assoc ] ; then rm temp_assoc ; fi
if [ -e temp_assoc_dhcp ] ; then rm temp_assoc_dhcp ; fi
if [ -e temp_assoc_static ] ; then rm temp_assoc_static ; fi
if [ -e bssid ] ; then rm bssid ; fi
if [ -e noise ] ; then rm noise ; fi
if [ -e temp_mac ] ; then rm temp_mac ; fi
if [ -e temp_mac2 ] ; then rm temp_mac2 ; fi
if [ -e kabel_dhcp ] ; then rm kabel_dhcp ; fi
if [ -e kabel_statisk ] ; then rm kabel_statisk ; fi

# Presenter linjedata
clear
cat /etc/banner | grep STY
printf '%-72s\n' "============================== LINJE ============================="
adsl info --stats | grep -m 1 -B 1 -i "Bearer:" > linje
adsl info --stats | grep -i mode -m 1 >> linje
adsl info --stats | grep -w -A 3 -i down >> linje
adsl info --stats | grep -m 1 -i hec >> linje
adsl info --stats | grep -A 2 -i "total time" >> linje
adsl info --stats | grep -A 2 -i since >> linje
cat linje
rm linje

# Presenter tilkoblinger
printf '\n%91s\n' "=================================== TILKOBLINGER =========================================="
printf '%-13s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\n' \
    "Grensesnitt:" \
    "WLAN:" \
    "$(if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ] ; then printf '%s' "aktiv" ; fi)" \
    "LAN1:" \
    "$(if [ "$(cat /sys/class/net/eth4/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN2:" \
    "$(if [ "$(cat /sys/class/net/eth3/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN3:" \
    "$(if [ "$(cat /sys/class/net/eth2/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN4:" \
    "$(if [ "$(cat /sys/class/net/eth1/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)"

# Finn antall og type enheter
total=$(brctl showmacs br-lan | grep -v 00:22:07 | grep -v 02:22:07 | tail -n +2 | wc -l)
wlan=$(wlctl assoclistinfo | tail -n +3 | awk '{print $2}' | wc -l)
kabel=$(( $total-$wlan ))

# Presenter enheter
printf '%-30s\t%-8s%-3s\t%-6s%-3s\t%-8s%-3s\n\n' "Aktive nettverkstilkoblinger:" "Totalt:" "$total" "WLAN:" "$wlan" "Kablet:" "$kabel"
total=

# Dersom trådløs er aktivert
if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ]
    then

    # Hent alle trådløse enheters MAC-adresse
    wlctl assoclistinfo | tail -n +3 | awk '{print $2}' > temp_assoc
    printf '%s\n' "================================================= WIRELESS =================================================="

    # Antennehastighet
    wlctl rate > wlrate 

    # Hent modemets ssid, bakgrunnsstøy og kanal. Presenter resultat.
    wlctl status | grep -B 1 -i mode | awk 'NR%2{printf $0" ";next;}1' | awk -F'"' '{print $2}' >> modem_ssid
    wlctl status | grep -B 1 -i mode | awk 'NR%2{printf $1" ";next;}1' | awk {'print " Bakgrunnstoy(noise): " $11"dBm, kanal " $14'} >> modem_ssid2
    printf '%-6s%-32s%-18s%-9s%s\n' "SSID: " "$(cat modem_ssid)" "Antennehastighet: " "$(cat wlrate)" "$(cat modem_ssid2)"
    rm modem_ssid modem_ssid2 wlrate

    # Hent andre trådløse nettverk og skriv til bssid (fil) og bakgrunnsstøy til noise (fil) dersom dette finnes
    if [ -n "$(wlctl scanresults_summary)" ]
        then
        wlctl scanresults_summary >> bssid
        wlctl scanresults | grep -B 1 -i mode | sed '/^--$/d' | awk 'NR%2{printf $1" ";next;}1' | awk '{for(i=1;i<=NF;i++)if($i~/noise:/)print $(i+1)}' >> noise
        else
        printf '\n'
    # /if [ -n "$(wlctl scanresults_summary)" ]
    fi  

    # List opp nabonettverk. Dersom bssid (fil) finnes, så finnes også noise (fil)
    if [ -e bssid ]
        then
        printf '%s\n' "Andre nettverk ----------------------------------------------------------------------------------------------"

        # Flett sammen bssid (fil) og noise (fil) til nett (fil)
        exec 6<noise
        while read -r line ; do
            read -r f2line <&6
            echo $line " noise: "$f2line >> nett
        done < bssid 
        exec 6<&-
        rm bssid noise

        # Presenter nabonettverk
        awk ' \
            {for(i=1;i<=NF;i++)if($i~/^SSID:/){ s=index($0," SSID");e=index($0," noise"); printf "SSID%-34s", substr($0,s+5,e-(s+6))}} \
            {for(i=1;i<=NF;i++)if($i~/BSSID:/)printf "BSSID: %-20s", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/RSSI:/)printf "RSSI: %s dBm, ", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/noise:/)printf "noise: %s dBm ", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%-7s%-2s\n", "Kanal: ", $(i+1)}' nett
        printf '\n'
        rm nett
    # /if [ -e bssid ]
    fi 

    # Sorter wifi enheter i dhcp og statisk liste
    while read enhet; do
    if [ -z "$(grep -i $enhet /tmp/dhcp.leases)" ]
        then
                printf '%s\n' "$enhet" >> temp_assoc_static
        else
                printf '%s\n' "$enhet" >> temp_assoc_dhcp
    fi
    done < temp_assoc
    rm temp_assoc

    # Dersom trådløse maskiner finnes i DHCP-tabellen presenteres disse som dynamisk satte klienter
    if [ -e temp_assoc_dhcp ]
        then
        printf '%s\n' "Klienter (DHCP)----------------------------------------------------------------------------------------------"
        while read enhet; do
            printf '%-6s%-32s%-5s%-19s%-4s%-16s%-20s%-4s%-3s\n' \
                "Navn: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $4'})" \
                "MAC: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $2'})" \
                "IP: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $3'})" \
                "Signalstyrke(RSSI):" \
                "$(wlctl rssi $enhet)" \
                "dBm"
        done < temp_assoc_dhcp
        rm temp_assoc_dhcp
    fi

    # Dersom trådløse maskiner ikke finnes i DHCP-tabellen presenteres disse som statisk satte klienter
    if [ -e temp_assoc_static ]
        then
        printf '%s\n' "Klienter (Statisk) ------------------------------------------------------------------------------------------"
        while read enhet; do
        printf '%-38s%-5s%-19s%-4s%-16s%-20s%-4s%-3s\n' \
            "Statisk (navn ikke synlig)" \
            "MAC:" \
            "$enhet"  \
            "IP:" \
            "$(grep -i $enhet /proc/net/arp | awk '{print $1}')" \
            "Signalstyrke(RSSI):" \
            "$(wlctl rssi $enhet)" \
            "dBm"
        done < temp_assoc_static
        rm temp_assoc_static
    fi
# /if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ]
fi

# Dersom minst en kablet maskin eksisterer
if [ "$kabel" -gt 0 ]
    then

    # Hent alle MAC-adresser tilkoblet
    brctl showmacs br-lan | grep -v 00:22:07 | grep -v 02:22:07 | tail -n +2 | awk '{print $2}' > temp_mac

    # Dersom minst en trådløs maskin
    if [ "$wlan" -gt 0 ]
        then
        wlctl assoclistinfo | tail -n +3 | awk '{print $2}' > temp_assoc

        # Fjern alle trådløse MAC fra den generelle listen
        while read enhet ; do
        if [ -z "$(grep -i $enhet temp_assoc)" ]
            then
            printf '%s\n' "$enhet" >> temp_mac2 
        fi
        done < temp_mac
        rm temp_mac
        rm temp_assoc
    # /if [ "$wlan" -gt 0 ]
    fi

    # Dersom ingen trådløse klienter bruker vi opprinnelig MAC liste
    if ! [ -e temp_mac2 ]
        then
        mv temp_mac temp_mac2
    fi

    # Sorter kablede enheter i dhcp og statisk liste
    while read enhet; do
    if [ -z "$(grep -i $enhet /tmp/dhcp.leases)" ]
        then
                printf '%s\n' "$enhet" >> kabel_statisk
        else
                printf '%s\n' "$enhet" >> kabel_dhcp
    fi
    done < temp_mac2
    rm temp_mac2

    # Har nå enten liste over dynamisk tildelte klienter eller statisk satte klienter eller begge deler
    printf '%-110s\n' "==================================================  KABEL ==================================================="

    # Presenter kablede dynamiske klienter
    if [ -e kabel_dhcp ]
        then
        printf '%s\n' "Klienter (DHCP)----------------------------------------------------------------------------------------------"
        while read enhet ; do
        printf '%-6s%-32s%-5s%-19s%-4s%-16s\n' \
            "Navn: " \
            "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $4'})" \
            "MAC: " \
            """$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $2'})" \
            "IP: " \
            "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $3'})"
        done < kabel_dhcp
        rm kabel_dhcp
    # /if [ -e kabel_dhcp ]
    fi

    # Presenter kablede statiske klienter
    if [ -e kabel_statisk ]
        then
        printf '%s\n' "Klienter (Statisk) ------------------------------------------------------------------------------------------"
        while read enhet ; do
        printf '%-38s%-5s%-19s%-4s%-16s\n' \
            "Statisk (navn ikke synlig)" \
            "MAC:" \
            "$enhet" \
            "IP:" \
            "$(grep -i $enhet /proc/net/arp | awk '{print $1}')"
        done < kabel_statisk
        rm kabel_statisk
    # /if [ -e kabel_statisk ]
    fi
# /if [ "$kabel" -gt 0 ]
fi

# Opprydding
kabel=
wlan=
printf '\n\n%s\n\n' "WIFI nettverkslisten kan oppdateres manuelt med 'wlctl scan', merk at dette tar 15 sekunder."

Output:

IOP Version: DG150-WU7P2U_STY2.4.11RC1-150127_1608
============================== LINJE =============================
Max:    Upstream rate = 1277 Kbps, Downstream rate = 15388 Kbps
Bearer: 0, Upstream rate = 669 Kbps, Downstream rate = 7198 Kbps
Mode:                   ADSL2+ Annex B
                Down            Up
SNR (dB):        9.4             19.2
Attn(dB):        40.5            23.2
Pwr(dBm):        19.5            13.1
HEC:            1655            0
Total time = 49 days 16 hours 25 sec
FEC:            10300984                716
CRC:            1238            0
Since Link time = 4 days 14 hours 31 min 17 sec
FEC:            824368          212
CRC:            141             0

=================================== TILKOBLINGER ==========================================
Grensesnitt:    WLAN: aktiv     LAN1: av        LAN2: aktiv     LAN3: aktiv     LAN4: aktiv
Aktive nettverkstilkoblinger:   Totalt: 3       WLAN: 0         Kablet: 3

================================================= WIRELESS ==================================================
SSID: Inteno-859A                     Antennehastighet: 144 Mbps  Bakgrunnstoy(noise): -80dBm, kanal 1
Andre nettverk ----------------------------------------------------------------------------------------------
SSID: Liverpool                       BSSID: 00:21:29:0C:6E:B8   RSSI: -61 dBm, noise: -78 dBm Kanal: 6
SSID: Liverpool1                      BSSID: B8:A3:86:55:C2:9C   RSSI: -62 dBm, noise: -78 dBm Kanal: 11
SSID: Kjetil sin Chromecast           BSSID: FA:8F:CA:76:98:6A   RSSI: -64 dBm, noise: -78 dBm Kanal: 11

==================================================  KABEL ===================================================
Klienter (DHCP)----------------------------------------------------------------------------------------------
Navn: *                               MAC: 00:21:29:0c:6e:b7  IP: 192.168.1.129
Navn: DIR-655                         MAC: b8:a3:86:55:c2:9d  IP: 192.168.1.113
Klienter (Statisk) ------------------------------------------------------------------------------------------
Statisk (navn ikke synlig)            MAC: 00:10:f3:18:c2:9f  IP: 192.168.1.20


WIFI nettverkslisten kan oppdateres manuelt med 'wlctl scan', merk at dette tar 15 sekunder.

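Answer 5 (score: 0)

You can handle it in a single loop by simply assigning the values to variables, and this approach also works in other situations. For example, if the order of the data is rearranged, it will still work just as the multiple-loop version does.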

#!/bin/sh
awk ' \
    {for(i=1;i<=NF;i++) {
    if($i~/Channel:/)chan="Kanal: "$(i+1)
    if($i~/^SSID:/)ssid="SSID: "$(i+1)
    if($i~/^BSSID:/)bssid="BSSID: "$(i+1)
    if($i~/^noise:/)noise="noise: "$(i+1)
    if($i~/^RSSI:/)rssid="RSSI: "$(i+1)
    if ((length(chan)>1) && (length(ssid)>1) && (length(bssid)>1) && (length(noise)>1) && (length(rssid)>1)) { 
     printf "%-30s %-24s %-10s dBm, %s %s\n",ssid,bssid,rssid,noise,chan
     chan="";ssid="";bssid="";noise="";rssid=""
    }
} }' "${1}"  

FYI: I took your test data and appended it to itself 20k times, then timed the multiple-loop solution against mine. The results:

cp awk_test.txt awk_test_tmp.txt; for i in {1..20000}; do cat awk_test_tmp.txt >> awk_test.txt; done

$> time selected_solution.awk awk_test.txt 1>/dev/random

real    0m5.944s
user    0m5.500s
sys 0m0.440s

$> time my.awk awk_test.txt 1>/dev/random

real    0m4.733s
user    0m4.382s
sys 0m0.347s