Formatting a log file with Perl

Time: 2014-05-22 14:52:57

Tags: regex perl file-io substitution

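I have a file of logs whose wrapped format cannot be fixed on the machine side by increasing the page width. The only remaining option is to collect the logs into a file and edit that file.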

cat failure.txt

04-05-22 12:57:38 \GINGER.$VOLS01   COBULED.ROM.H01       005056  LDEV 0222 File
                                    $VOLS01.STEPHAN.TABLED, has been stopped 
                                    due to a processing error.
04-05-22 12:57:39 \GINGER.$VOLS02   COBULED.ROM.H01       005056  LDEV 0221 File
                                    $VOLS02.STEPHAN.TABLED, has been altered 
                                    due to a processing error.
04-05-22 12:57:40 \GINGER.$VOLS08   COBULED.ROM.H01       005056  LDEV 0216 File
                                    $VOLS08.STEPHAN.TABLED, has been rolled back 
                                    due to a processing error.

I wrote a simple Perl program:

    open my $read_failed_log, '<', 'failure.txt' or die "Could not open due to $!";
    open my $write_failed_log, '>', 'failure_formated' or die "Could not open due to $!";

    while (my $x = <$read_failed_log>) {
        if (grep /^\S/, $x) {
            # Line starts at column 0: treat it as the start of a new entry.
            print $write_failed_log "\n";
            print $write_failed_log $x;
        } else {
            # Indented continuation line: printed unchanged.
            print $write_failed_log $x;
        }
    }

    close $read_failed_log;
    close $write_failed_log;

But this does not give the output I need. Desired output:

cat failure_formated.txt

04-05-22 12:57:38 \GINGER.$VOLS01   COBULED.ROM.H01       005056  LDEV 0222 File $VOLS01.STEPHAN.TABLED, has been stopped due to a processing error.
04-05-22 12:57:39 \GINGER.$VOLS02   COBULED.ROM.H01       005056  LDEV 0221 File $VOLS02.STEPHAN.TABLED, has been altered due to a processing error.
04-05-22 12:57:40 \GINGER.$VOLS08   COBULED.ROM.H01       005056  LDEV 0216 File $VOLS08.STEPHAN.TABLED, has been rolled back due to a processing error.

In short, the log lines should not be broken: each log entry should sit on a single line, as shown above for failure_formated.txt.

2 answers:

Answer 0: (score: 1)

This chomps the newline from every line and inserts a newline at the start of any line that begins with a digit.

while (my $x = <$read_failed_log>) {
    chomp($x);
    $x =~ s/^(?=\d)/\n/;
    print $write_failed_log $x;
}

As a one-liner:

perl -pe 'chomp; s/^(?=\d)/\n/' failure.txt > failure_formated
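The loop above keeps the indentation of the continuation lines and prints an empty first line, because the newline is emitted before each entry rather than after it. A minimal sketch of one way to address both, assuming every continuation line begins with whitespace and every new entry starts at column 0 (the helper name `join_entries` is hypothetical, not part of the original answer):

```perl
use strict;
use warnings;

# Hypothetical helper: appends indented continuation lines to the
# preceding entry, separated by a single space.
sub join_entries {
    my @lines = @_;
    my @entries;
    for my $line (@lines) {
        $line =~ s/\s+\z//;               # drop newline and trailing whitespace
        next if $line eq '';              # skip blank lines
        if ($line =~ /^\s/ && @entries) {
            $line =~ s/^\s+//;            # strip the continuation indent
            $entries[-1] .= " $line";
        } else {
            push @entries, $line;         # a new entry starts here
        }
    }
    return @entries;
}

# Sample input taken from the question (single quotes avoid interpolation).
my @sample = (
    '04-05-22 12:57:38 \GINGER.$VOLS01   COBULED.ROM.H01       005056  LDEV 0222 File' . "\n",
    '                                    $VOLS01.STEPHAN.TABLED, has been stopped ' . "\n",
    '                                    due to a processing error.' . "\n",
);
print "$_\n" for join_entries(@sample);
```

To run it against the real file, the sample array could be replaced by the lines read from `failure.txt`.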

Answer 1: (score: 0)

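chomp fixes your line breaks, but there is a lot of whitespace between two of the fields, so the joined lines will still wrap. Try trimming that part of the string (see the example below):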

open my $read_failed_log, '<', "failure.txt" or die "Could not open: $!";
open my $write_failed_log, '>', "failure_formated" or die "Could not open: $!";

while (my $x = <$read_failed_log>) {
    chomp $x;           # strip the trailing newline from the line just read
    $x =~ s/ {35}//g;   # remove each run of exactly 35 spaces (the continuation indent)
    if (grep /^\S/, $x) {
        # Line starts at column 0: begin a new output line first.
        print $write_failed_log "\n";
        print $write_failed_log $x;
    }
    else {
        print $write_failed_log $x;
    }
}

close $read_failed_log;
close $write_failed_log;

Cleaning up the data file this way before parsing or printing is probably a job for a one-liner (see @mpapec's answer for a hint). Hope that helps.
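As one possible form of that cleanup, the whole file can be treated as a single string and every newline followed by indentation folded into one space. This is a sketch under the assumption that only continuation lines are indented; the helper name `unwrap_log` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: replaces "optional trailing blanks, newline,
# leading blanks" with a single space, so indented continuation lines
# are folded into the entry they belong to.
sub unwrap_log {
    my ($text) = @_;
    $text =~ s/[ \t]*\n[ \t]+/ /g;
    return $text;
}

# Sample input taken from the question; <<'END' disables interpolation.
my $sample = <<'END';
04-05-22 12:57:39 \GINGER.$VOLS02   COBULED.ROM.H01       005056  LDEV 0221 File
                                    $VOLS02.STEPHAN.TABLED, has been altered 
                                    due to a processing error.
END

print unwrap_log($sample);
```

The same substitution should also work from the command line in slurp mode, for example `perl -0777 -pe 's/[ \t]*\n[ \t]+/ /g' failure.txt > failure_formated`.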