How do I use a regular expression in an awk script?

Asked: 2013-09-30 16:48:41

Tags: regex bash awk

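Hi, I am trying to create a script that gives me a summary of newsgroup activity. Most of it works so far, except where I use the match operator to check whether field $6 matches an expression. I want to have all the rings under one section. This is my script: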

newsread.awk:

BEGIN{
print "\t\t\tNews Reader Summary\n\n"
printf("               %-15s%-15s%-15s%-15s\n\n","lonestar","runner","ringer","rings"); 
articles[4];
groups[4];
times[4];
cs2413[4];cs2413d[4];
}

NR == 1 {date1 = $1 " " $2 " " $3}

$6 == "lonestar.jpl.utsa.edu"{
    if ($7=="group"){
        articles[1]+=$9;
        if ($8=="utsa.cs.2413"){
            cs2413[1]+=$9;
        }
        if ($8=="utsa.cs.2413.d"){
            cs2413d[1]+=$9;
        }
    }else if ($7 == "exit"){
        articles[1]+=$9;
        groups[1]+=$11;
    }else {
        times[1]+=$13;
    }
}

$6 == "runner.jpl.utsa.edu"{
    if ($7=="group"){
        articles[2]+=$9;
        if ($8=="utsa.cs.2413"){
            cs2413[2]+=$9;
        }
        if ($8=="utsa.cs.2413.d"){
            cs2413d[2]+=$9;
        }
    }else if ($7 == "exit"){
        articles[2]+=$9;
        groups[2]+=$11;
    }else {
        times[2]+=$13;
    }
}

$6 == "ringer.cs.utsa.edu"{
    if ($7=="group"){
        articles[3]+=$9;
        if ($8=="utsa.cs.2413"){
            cs2413[3]+=$9;
        }
        if ($8=="utsa.cs.2413.d"){
            cs2413d[3]+=$9;
        }
    }else if ($7 == "exit"){
        articles[3]+=$9;
        groups[3]+=$11;
    }else {
        times[3]+=$13;
    }
}

$6 ~ "/ring??.cs.utsa.edu/"{
    if ($7=="group"){
        articles[4]+=$9;
        if ($8=="utsa.cs.2413"){
            cs2413[4]+=$9;
        }
        if ($8=="utsa.cs.2413.d"){
            cs2413d[4]+=$9;
        }
    }else if ($7 == "exit"){
        articles[4]+=$9;
        groups[4]+=$11;
    }else {
        times[4]+=$13;
    }
}
END{
    date2 = $1 " " $2 " " $3
    printf("Articles:      %-15d%-15d%-15d%-15d\n",articles[1],articles[2],articles[3],articles[4]); 
    printf("Groups:        %-15d%-15d%-15d%-15d\n",groups[1],groups[2],groups[3],groups[4]); 
    printf("Cs2413:        %-15d%-15d%-15d%-15d\n",cs2413[1],cs2413[2],cs2413[3],cs2413[4]); 
    printf("Cs2413.d:      %-15d%-15d%-15d%-15d\n",cs2413d[1],cs2413d[2],cs2413d[3],cs2413d[4]); 
    printf("User Time:     %-15d%-15d%-15d%-15d\n",times[1],times[2],times[3],times[4]);  
    printf("\nStart Time = %s\tEnd Time = %s\n",date1,date2); 

}

Here is a snippet of what news.notice looks like:

Feb 13 21:27:14 ringer nnrpd[11474]: lonestar.jpl.utsa.edu group alt.education.distance 19
Feb 13 21:27:14 ringer nnrpd[11474]: lonestar.jpl.utsa.edu exit articles 19 groups 1
Feb 13 21:27:14 ringer nnrpd[11474]: lonestar.jpl.utsa.edu times user 0.470 system 0.930 elapsed 4.766
Feb 13 21:27:49 ringer nnrpd[11462]: ring42.cs.utsa.edu exit articles 0 groups 2
Feb 13 21:27:49 ringer nnrpd[11462]: ring42.cs.utsa.edu times user 2.020 system 1.430 elapsed 45.114
Feb 13 21:28:00 ringer nnrpd[11482]: lonestar.jpl.utsa.edu group utsa.lonestar 7
Feb 13 21:28:00 ringer nnrpd[11482]: lonestar.jpl.utsa.edu exit articles 7 groups 1
Feb 13 21:28:00 ringer nnrpd[11482]: lonestar.jpl.utsa.edu times user 0.520 system 0.890 elapsed 48.286
Feb 13 21:28:38 ringer innd: ME running
Feb 13 21:28:43 ringer nnrpd[11344]: lonestar.jpl.utsa.edu unrecognized NOOP
Feb 13 21:29:01 ringer nnrpd[11601]: lonestar.jpl.utsa.edu connect
Feb 13 21:29:01 ringer nnrpd[11601]: lonestar.jpl.utsa.edu exit articles 0 groups 0
Feb 13 21:29:01 ringer nnrpd[11601]: lonestar.jpl.utsa.edu times user 0.470 system 0.770 elapsed 1.456
Feb 13 21:29:03 ringer nnrpd[11602]: lonestar.jpl.utsa.edu connect
Feb 13 21:29:03 ringer nnrpd[11472]: ring29.cs.utsa.edu exit articles 0 groups 0
Feb 13 21:29:03 ringer nnrpd[11472]: ring29.cs.utsa.edu times user 1.360 system 0.790 elapsed 114.771
Feb 13 21:29:03 ringer nnrpd[11602]: lonestar.jpl.utsa.edu exit articles 0 groups 0
Feb 13 21:29:03 ringer nnrpd[11602]: lonestar.jpl.utsa.edu times user 0.530 system 0.650 elapsed 1.524
Feb 13 21:29:25 ringer nnrpd[11615]: lonestar.jpl.utsa.edu connect

I am using this command:

awk -f newsread.awk news.notice > newsread.summary

This is newsread.summary:

            News Reader Summary


               lonestar       runner         ringer         rings          

Articles:      144686         25066          2              0              
Groups:        5282           8344           19             0              
Cs2413:        0              0              0              0              
Cs2413.d:      40             25             0              0              
User Time:     266197         83377          128            0              

Start Time = Feb 13 21:27:14    End Time = Feb 14 20:56:49

It has to be an awk script.

2 Answers:

Answer 0: (score: 2)

First, get rid of the quotes, i.e. not this:

$6 ~ "/ring??.cs.utsa.edu/"

but this:

$6 ~ /ring??.cs.utsa.edu/

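Quotes delimit strings; slashes delimit constant REs. When the slashes sit inside a quoted string they are no longer delimiters but literal characters that $6 would have to contain, which is why the rings column stays at 0.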
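The difference between the two forms can be checked in isolation. The following is a minimal sketch, assuming a POSIX shell and any POSIX awk; `match_forms` is a hypothetical helper name introduced only for this illustration, and the pattern is shortened to `ring` to keep the focus on the delimiters.

```shell
# Hypothetical helper: reports which form of the pattern matches a host name.
match_forms() {
    printf '%s\n' "$1" | awk '
        # Quoted string used as a dynamic regex: the slashes are literal characters.
        $1 ~ "/ring/" { print "quoted string matched" }
        # Regex constant: the slashes only delimit the pattern.
        $1 ~ /ring/   { print "regex constant matched" }
    '
}

match_forms ring42.cs.utsa.edu
```

For `ring42.cs.utsa.edu` only the regex constant rule fires, because the host name contains no `/` characters for the quoted form to match.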

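Now, I suspect your RE is wrong, because ?? means the preceding character repeated 0 or 1 times and then the same again, or a literal question mark (I am not sure which, and it makes no sense either way), and . means "any single character". This is a regular expression, not shell globbing: different metacharacters with different meanings.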
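To make the metacharacter difference concrete, here is a small sketch, assuming a POSIX shell and awk; `re_demo` is a hypothetical name used only here. It deliberately uses a single `?`, since the meaning of `??` is the unclear part, and prints 1 for a match and 0 for no match.

```shell
# Hypothetical helper: shows how ? and . behave in an awk regular expression.
re_demo() {
    awk 'BEGIN {
        print ("rin"   ~ /^ring?$/)   # ? makes the preceding g optional
        print ("ring"  ~ /^ring?$/)   # g present once still matches
        print ("ringX" ~ /^ring?$/)   # ? does not stand for an arbitrary character
        print ("ringX" ~ /^ring.$/)   # . is what matches any single character
    }'
}

re_demo
```

This prints 1, 1, 0, 1 on separate lines. In a shell glob, by contrast, `ring?` would match `ringX` and not `rin`.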

You probably want this:

$6 ~ /^ring..\.cs\.utsa\.edu$/
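One thing worth checking with this pattern is that `..` accepts any two characters, so `ringer.cs.utsa.edu` also matches and would be counted under both the ringer and rings columns. The sketch below compares it with a digits-only variant; that variant assumes the ring hosts are always named with exactly two digits, which is only inferred from the `ring42` and `ring29` entries in the sample log. `classify_hosts` is a hypothetical helper name.

```shell
# Hypothetical helper: reads host names on stdin and reports which pattern each one matches.
classify_hosts() {
    awk '
        # Pattern from the answer: any two characters after "ring".
        $1 ~ /^ring..\.cs\.utsa\.edu$/         { print $1, "matches ring.." }
        # Assumed stricter variant: exactly two digits after "ring".
        $1 ~ /^ring[0-9][0-9]\.cs\.utsa\.edu$/ { print $1, "matches ring[0-9][0-9]" }
    '
}

printf '%s\n' ring42.cs.utsa.edu ringer.cs.utsa.edu lonestar.jpl.utsa.edu | classify_hosts
```

Here `ring42.cs.utsa.edu` matches both patterns, `ringer.cs.utsa.edu` matches only the `ring..` form, and `lonestar.jpl.utsa.edu` matches neither. If ringer should stay out of the rings column, the bracketed form may be the safer choice.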

Answer 1: (score: 1)

Lose the double quotes. Write this:

$6 ~ /regex/

not this:

$6 ~ "/regex/"