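Moving values around in a Perl array

Date: 2012-07-03 15:51:48

Tags: perl

I'm having trouble getting my array columns into a consistent format. I have the following output: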

    Mon,Jun,25,14:39:29,2012,971,29,0,25,0,0,0,4,Mon,Jun,25,14:39:29,2012,25,mod_was_ap22_http.c
    Mon,Jun,25,14:40:29,2012,972,28,0,25,0,0,0,3,Mon,Jun,25,14:40:29,2012,3,mod_sm22.cpp,22,mod_was_ap22_http.c
    Mon,Jun,25,14:41:29,2012,973,27,0,24,0,0,0,3,Mon,Jun,25,14:41:29,2012,24,mod_was_ap22_http.c
    Mon,Jun,25,14:42:29,2012,974,26,0,20,0,0,0,6,Mon,Jun,25,14:42:29,2012,1,mod_sm22.cpp,19,mod_was_ap22_http.c
    Mon,Jun,25,14:43:29,2012,971,29,0,26,0,0,0,3,Mon,Jun,25,14:43:29,2012,2,mod_sm22.cpp,24,mod_was_ap22_http.c
    Mon,Jun,25,14:44:30,2012,957,43,0,41,0,0,0,2,Mon,Jun,25,14:44:30,2012,1,mod_sm22.cpp,40,mod_was_ap22_http.c
    Mon,Jun,25,14:45:30,2012,963,37,0,35,0,0,0,2,Mon,Jun,25,14:45:30,2012,2,mod_sm22.cpp,32,mod_was_ap22_http.c
    Mon,Jun,25,14:46:30,2012,972,28,0,24,1,1,0,2,Mon,Jun,25,14:46:30,2012,24,mod_was_ap22_http.c,1,ApacheModule.cpp
    Mon,Jun,25,14:47:30,2012,961,39,1,37,0,0,0,1,Mon,Jun,25,14:47:30,2012,37,mod_was_ap22_http.c,1,ApacheModule.cpp
    Mon,Jun,25,14:48:30,2012,968,32,0,30,0,0,0,2,Mon,Jun,25,14:48:30,2012,30,mod_was_ap22_http.c
    Mon,Jun,25,14:49:30,2012,972,28,0,25,0,0,0,3,Mon,Jun,25,14:49:30,2012,1,mod_sm22.cpp,24,mod_was_ap22_http.c

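The columns I would like to display are: DayOfWeek,Month,Day,Time,Year,Rdy,Bsy,Rd,Wr,Ka,Log,Dns,Cls,AP22,SM22,ApacheModule

Currently the last three columns (AP22, SM22, ApacheModule) are not in that order; the rest are correct. The lines do not follow that format consistently: sometimes ap22 comes first, sometimes sm22 comes first, and sometimes a line has none of the three modules or all of them. The number in front of each module name belongs to that module. How can I move the data into a consistent format?

Note that the second date on each line, and the names mod_was_ap22_http.c, mod_sm22.cpp, and ApacheModule.cpp, will be removed from the final array.

Here is my code so far: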

# This program parses an error log for necessary information and outputs it in CSV format.

# chunks of your input to ignore, see below... 
my %ignorables = map { $_ => 1 } qw([notice mpmstats: rdy bsy rd wr ka log dns cls bsy: in);  

# 3-arg open is safer than 2, lexical my $fh better than a global FH glob
open my $error_fh, '<', 'iset_error_log'
    or die "Cannot open iset_error_log: $!";

sub findLines {
    my($item,@result)=("");
    # Iterates over the lines in the file, putting each into $_ 
    while (<$error_fh>) {      

        # Select only those fields that have the word 'notice'
        if (/\[notice/) {          

            # Place those lines with the word 'rdy' on the next line
            if (/\brdy\b/){
                push @result,"$item\n";
                $item="";

            }
            else {
                $item.=",";
            }

            # Split the line into fields, separated by spaces, skip the %ignorables         
            my @line = grep { not defined $ignorables{$_} } split /\s+/;    

            # More cleanup         
            s/|^\[|notice|[]]//g for @line; # remove unnecessary elements from the array

            # Output the line.  
            @line = join(",", @line);          
            s/,,/,/g for @line;
            map $item.=$_, @line;
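        }
    }
    # Keep the record that was still being assembled when the file ended
    push @result, "$item\n" if length $item;
    return @result;
}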

my @array = findLines();
foreach my $line (@array) {
    print $line; # This is where I would like to organize the lines if possible.
}

My input file looks like this:

[Mon Jun 25 07:51:17 2012] [notice] mpmstats: rdy 990 bsy 10 rd 0 wr 7 ka 0 log 0 dns 0 cls 3
[Mon Jun 25 07:51:17 2012] [notice] mpmstats: bsy: 2 in mod_sm22.cpp, 5 in mod_was_ap22_http.c
[Mon Jun 25 08:08:17 2012] [notice] mpmstats: rdy 974 bsy 26 rd 1 wr 24 ka 0 log 0 dns 0 cls 1
[Mon Jun 25 08:08:17 2012] [notice] mpmstats: bsy: 1 in mod_sm22.cpp, 23 in mod_was_ap22_http.c, 1 in ApacheModule.cpp        Mon,Jun,25,14:38:29,2012,962,38,0,36,0,0,0,2,Mon,Jun,25,14:38:29,2012,3,mod_sm22.cpp,33,mod_was_ap22_http.c

    [Mon Jun 25 21:54:41 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 0 ka 0 log 0 dns 0 cls 1
    [Mon Jun 25 21:55:41 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 0 ka 0 log 0 dns 0 cls 1
    [Mon Jun 25 21:59:41 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 21:59:41 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c
    [Mon Jun 25 22:00:41 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:00:41 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c
    [Mon Jun 25 22:03:41 2012] [notice] mpmstats: rdy 998 bsy 2 rd 0 wr 2 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:03:41 2012] [notice] mpmstats: bsy: 2 in mod_was_ap22_http.c
    [Mon Jun 25 22:08:42 2012] [notice] mpmstats: rdy 998 bsy 2 rd 0 wr 2 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:08:42 2012] [notice] mpmstats: bsy: 2 in mod_was_ap22_http.c
    [Mon Jun 25 22:21:42 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:21:42 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c
    [Mon Jun 25 22:22:42 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:22:42 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c
    [Mon Jun 25 22:31:42 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 0 ka 0 log 0 dns 0 cls 1
    [Mon Jun 25 22:32:42 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 22:32:42 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c
    [Mon Jun 25 23:06:43 2012] [notice] mpmstats: rdy 999 bsy 1 rd 0 wr 1 ka 0 log 0 dns 0 cls 0
    [Mon Jun 25 23:06:43 2012] [notice] mpmstats: bsy: 1 in mod_was_ap22_http.c

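1 answer:

Answer 0 (score: 0)

You probably want to reorder the columns while the line is still split into fields, before you use join to turn it back into a single line of text.

All you need to do is swap the elements.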

# 0        ,1    ,2  ,3   ,4   ,5  ,6  ,7 ,8 ,9 ,10 ,11 ,12 ,13  ,14  ,15
# DayOfWeek,Month,Day,Time,Year,Rdy,Bsy,Rd,Wr,Ka,Log,Dns,Cls,AP22,SM22,ApacheModule
#
# Sometimes the last 2 fields are missing and 13 comes before 14 and 15 in the
# input, so fix that.
# Pad short lines up to 16 fields so the slice below never reads past the end.
push @line, '' while @line < 16; # or whatever you want for blanks

# Rearrange the array: element 13 moves behind elements 14 and 15.
@line = @line[0..12,14,15,13];
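
The swap above assumes the module counts always sit at fixed positions. The sample output in the question instead carries (count, module name) pairs in varying order after a repeated date, so looking the counts up by module name may be more robust. A minimal sketch, assuming the row layout shown in the question; the helper name normalize_row and the choice of 0 for a missing module are illustrative and not part of the original code:

```perl
use strict;
use warnings;

# Fixed output order for the per-module busy counts: AP22, SM22, ApacheModule.
my @modules = qw(mod_was_ap22_http.c mod_sm22.cpp ApacheModule.cpp);

# Hypothetical helper: takes one comma-joined row in the question's current
# output format and returns it with the module counts in a fixed order.
sub normalize_row {
    my ($row) = @_;
    chomp $row;
    my @fields = split /,/, $row;

    my @stats = @fields[0 .. 12];         # date (5 fields) + rdy..cls (8 fields)
    my @rest  = @fields[18 .. $#fields];  # skip the repeated date (fields 13..17)

    # What remains is a list of (count, module name) pairs in any order.
    my %count_for;
    while (@rest >= 2) {
        my ($count, $name) = splice @rest, 0, 2;
        $count_for{$name} = $count;
    }

    # Assumption: a module that does not appear on the line is reported as 0.
    my @counts = map { exists $count_for{$_} ? $count_for{$_} : 0 } @modules;
    return join ',', @stats, @counts;
}

print normalize_row('Mon,Jun,25,14:40:29,2012,972,28,0,25,0,0,0,3,Mon,Jun,25,14:40:29,2012,3,mod_sm22.cpp,22,mod_was_ap22_http.c'), "\n";
# prints Mon,Jun,25,14:40:29,2012,972,28,0,25,0,0,0,3,22,3,0
```

Because the counts are keyed by name, the order in which the modules appear in the log no longer matters, and the repeated date and the module names drop out of the result as the question asks.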

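Also, the regex substitution (s/,,/,/g) will break this if you use the empty string ('') for blank fields. It collapses adjacent commas, so short lines that are missing their last fields lose the blank placeholders again and fields 13 and 14 no longer land in the right columns.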
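
To make the effect concrete, here is a small sketch with a shortened, made-up row (three real fields plus two blanks) rather than the full 16-field line. It only illustrates how the substitution changes the field count:

```perl
use strict;
use warnings;

# A short row padded with two empty strings, as in the padding step above.
my @line = (qw(Mon Jun 25), '', '');

my $joined = join ',', @line;             # Mon,Jun,25,,
(my $collapsed = $joined) =~ s/,,/,/g;    # Mon,Jun,25,

# A limit of -1 makes split keep trailing empty fields.
my @before = split /,/, $joined,    -1;
my @after  = split /,/, $collapsed, -1;

printf "%d fields before, %d after\n", scalar @before, scalar @after;
# prints 5 fields before, 4 after
```

One way around this, if the substitution has to stay, is to pad with a visible placeholder such as 0 instead of the empty string, so that no two commas ever end up next to each other.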

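Based on the kinds of questions asked here and previously, I strongly recommend getting a copy of Modern Perl (available to download or buy) or Learning Perl to get a better grasp of the language as a whole. I have recently read a good part of the former and enjoyed it, and I got my initial Perl knowledge from an earlier edition of the latter.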