Shell script: grep regex in an array

Time: 2014-05-25 16:05:50

Tags: arrays regex shell grep

I have this script that prints how many times an IP failed to connect, along with the date of that IP's last attempt. It looks like this:

#!/bin/bash
searchString=$1
file=$2

countLines()
{
    declare -A ipCount
    declare -A lastDate

    cnt=0

    while read line;
    do
        ((cnt+=1))

        ipaddr=$( echo "$line" | grep -o -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' )

        lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z]\ [0-3][0-9]\ [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]' )

        ((ipCount[$ipaddr]+=1))
    done

    printf "%-18s %-10s %s\n" "IP" "Count" "lastDate"
    echo "-------------------------------------------------"

    for ip in ${!ipCount[*]}
    do
        printf "%-18s %-10s %s\n" "$ip" "${ipCount[$ip]}" "${lastDate[$ip]}"
    done | sort

    echo "--------------------------------------------------"
    echo "Count: $cnt"
}

grep "$searchString" "$file" | countLines

The file I am trying this on looks like this, only larger:

May 16 06:41:38 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:40 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:43 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:46 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:48 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2

What I get is this:

IP                 Tries      LastDate
-----------------------------------------------
37.141.229.226     205        
137.241.229.226    705        May 16 07:08:24
-----------------------------------------------
Count: 910

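As you can see, I get 'lastDate' for only one of the IPs. The same thing happens with large log files. I guess it is something simple, but I can't find the cause. Can you help me?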

I run the script like this: bash scriptname.sh "Failed password for root" logFile

1 answer:

Answer 0 (score: 1)

The problem appears to be in the regular expression for lastDate. Replace:

lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z]\ [0-3][0-9]\ [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]' )

with:

lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]' )

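The key part is the hours:minutes:seconds match. The original had [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]. Each [0-2][2-9] group only matches minutes or seconds whose tens digit is 0-2 and whose ones digit is 2-9, so a timestamp such as 06:41:38 never matches, while 07:08:24 does. A more general replacement is [0-2][0-9]:[0-5][0-9]:[0-5][0-9].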

Also, spaces and colons are not special characters in grep regular expressions, so they do not need to be escaped.
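
To see the difference between the two time patterns, here is a minimal sketch that runs both against two sample lines. The helper name extract_date is hypothetical. The first line is copied from the question's log, and the second is constructed from the 07:08:24 timestamp shown in the question's output. The old pattern is written without the unnecessary backslashes.

```shell
#!/bin/bash
# Time ranges from the question (backslashes dropped) and from the answer.
old_re='[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-2][2-9]:[0-2][2-9]'
new_re='[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]'

# Hypothetical helper: print the part of line $2 that matches pattern $1.
# "|| true" keeps the exit status at 0 when grep finds nothing.
extract_date() {
    printf '%s\n' "$2" | grep -o -E "$1" || true
}

# Taken from the question's sample log.
line_a='May 16 06:41:38 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2'
# Constructed for illustration from the timestamp in the question's output.
line_b='May 16 07:08:24 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2'

old_a=$(extract_date "$old_re" "$line_a")   # minute 41 is outside [0-2][2-9]
old_b=$(extract_date "$old_re" "$line_b")   # 08 and 24 both fit [0-2][2-9]
new_a=$(extract_date "$new_re" "$line_a")
new_b=$(extract_date "$new_re" "$line_b")

printf 'old pattern: [%s] [%s]\n' "$old_a" "$old_b"
printf 'new pattern: [%s] [%s]\n' "$new_a" "$new_b"
```

With the old pattern the first line yields an empty string, which is why lastDate stayed blank for any IP whose last attempt fell outside those digit ranges.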