Question

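I have a log file that needs to be formatted into a readable form. However, the text file has no fixed number of lines or fixed key values, and it contains a random number of blank lines. The only constant is a log file header, which can be used to identify where each entry logged by the application starts and ends.

An example of the log file: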

Log File header
<text>
<text>
Log File header
<text>

After the script has formatted it, it should look like this:

Log File header
<text>
<text>

<space>

Log File header
<text>
<text>

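So I need some advice on how to break the file into paragraphs each time the Perl script detects a "Log File header".

Here is the Perl script, which uses grep: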

#!/usr/bin/perl

#use 5.010; # must be present to import the new 5.10 functions, notice 
#that it is 5.010 not 5.10

my $file = "/root/Desktop/Logfiles.log";
open LOG, $file or die "The file $file has the error of:\n =>  $!";

@lines = <LOG>;
close (LOG);

@array = grep(/Log File header/, @lines);

print @array;

Can someone give me some advice on the code? Thanks.

Answer 1

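So you just want vertical space between the sections of the log file?

There are a few approaches, particularly because you know that the header will be on a line entirely by itself. In all of the following examples, assume that @lines has already been populated from your input file.

So the first technique: insert the space before the header: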

foreach my $line ( @lines ) {
    if ( $line =~ m/Log File header/ ) {
        print( "\n\n\n" ); # or whatever you want <space> to be
    }

    print( $line );
}
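
Note that the loop above also prints the space before the very first header. If that is not wanted, one possible variation is sketched below. The helper name separate_sections and the sample lines are made up for illustration, and the sketch assumes the header text is exactly "Log File header":

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the lines with $space inserted before
# every header line except the first one.
sub separate_sections {
    my ( $space, @lines ) = @_;
    my $seen_header = 0;
    my @out;

    foreach my $line ( @lines ) {
        if ( $line =~ m/Log File header/ ) {
            # Only add the gap once an earlier header has been seen
            push( @out, $space ) if $seen_header;
            $seen_header = 1;
        }
        push( @out, $line );
    }

    return @out;
}

# Sample data standing in for the contents of the real log file
my @lines = (
    "Log File header\n", "<text>\n", "<text>\n",
    "Log File header\n", "<text>\n",
);

print( separate_sections( "\n\n\n", @lines ) );

Returning a list instead of printing inside the loop keeps the formatting separate from the output, so the result could just as easily be written to a new file.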

The next technique is to use a regular expression to search/replace blocks of text:

my $space = "\n\n\n"; # or whatever you want <space> to be
my $everything = join( "", @lines );
$everything =~ s/(Log File header.*?)(?=Log File header)/$1$space/sg;
print( $everything );

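A little explanation of the regular expression. (?= begins a lookahead, which must match but does not form part of the text being replaced. The /sg modifiers mean: s lets . match newline characters as well, so that .*? can span several lines, and g performs a global search and replace. .*? means match anything, but as little as possible to satisfy the expression (non-greedy), which is very important in this application.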
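
To see why the non-greedy match matters, here is a small self-contained sketch that runs the same substitution on a made-up three-section sample, once with .*? and once with a greedy .* for contrast. The helper names add_gaps and add_gaps_greedy and the sample text are hypothetical:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution shown above
sub add_gaps {
    my ( $text, $space ) = @_;
    # Non-greedy: each match stops just before the NEXT header
    $text =~ s/(Log File header.*?)(?=Log File header)/$1$space/sg;
    return $text;
}

# Same substitution with a greedy .* for comparison only
sub add_gaps_greedy {
    my ( $text, $space ) = @_;
    # Greedy: the first match runs on to just before the LAST header
    $text =~ s/(Log File header.*)(?=Log File header)/$1$space/sg;
    return $text;
}

# Made-up sample with three sections
my $sample = "Log File header\nA\nLog File header\nB\nLog File header\nC\n";

print( "non-greedy:\n", add_gaps( $sample, "\n" ) );
print( "greedy:\n", add_gaps_greedy( $sample, "\n" ) );

With the non-greedy version a gap is inserted before the second and third headers. With the greedy version the first match swallows everything up to the last header, so only one gap appears.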

Update: edited the first technique, where I had not explicitly specified the variable to match against.
