Using split with Text::CSV_XS

Time: 2014-09-13 16:32:44

Tags: regex perl csv

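I am trying to parse a log file and convert it to a .csv file. I am having trouble with the split function. For example, I have the following in the log file: 21a94551,00:00:59.643;ERROR; . When I try to split on the comma (,) and the semicolon (;), I lose the .643 from the timestamp in the output csv file. I want to keep the time (00:00:59.643) intact. The log file has many lines, all with different numbers, so the values are not fixed.

When I print the values right after the split, they are shown on the screen correctly, but not in the CSV file.

I am new to Perl. Can someone explain what I am doing wrong? I think the problem may be in how the string is handled.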

use strict;
use Cwd;
use Excel::Writer::XLSX;
use Text::CSV_XS;
use Spreadsheet::Read;

my $dirname = getcwd;               # Set the directory to current working directory.
opendir (DIR, $dirname) || die;     # Open the current directory
my @FileNameList = readdir(DIR);    # Load the names of files in to an array

foreach (@FileNameList)             # Read each of the file names
{
    my $FileName = $_;
    my $Output;

    if ($FileName =~ m/iusp_\d+.log/)
    {
        print ("\n". $FileName." \n Correct Log File Found");

        open (my $file, "<", $FileName);

        while (<$file>) {
            chomp;    # Remove the \n from the last field
            my $Line = $_;    # Create the variable $Line and place the contents of the current line there

            if ( $Line =~ m/ERROR/ )    # Select any line that has "ERROR" inside it.
            {
                my @fields = split /[,;]/, $Line;    # Split up the line $Line on "," and ";"
                my $csv = Text::CSV_XS->new();         # Create new CSV
                $csv->combine(@fields);
                my $csvLine = $csv->string();
                print $csvLine, "\n";
                {
                    $Output = $csvLine . "\n";
                }
                my $OutputFileName = $FileName . ".csv";
                print( "\n Saving File:" . $OutputFileName );
                open( MyOutputFile, ">>$OutputFileName" );
                print MyOutputFile $Output;
            }    # End of if statement (ERROR line)
        }    # End of while statement
    }    # End of if statement (file name match)
}    # End of foreach

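2 answers:

Answer 0 (score: 6)

Simplify your regex. You do not need .* (see perldoc -f split). The dot is treated as a separator by split because it is inside the character class square brackets.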

use warnings;
use strict;
use Data::Dumper;

my $Line = '21a94551,00:00:59.643;ERROR;';
my @fs = split /[,;]/, $Line;
print Dumper(\@fs);

__END__
$VAR1 = [
          '21a94551',
          '00:00:59.643',
          'ERROR'
        ];
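
Once the fields come out of split intact, they can be handed to Text::CSV_XS as in the question. The following is only a minimal sketch, assuming Text::CSV_XS is installed; the second sample line and the names @lines and @csv_lines are made up for illustration and are not from the original post. It builds the CSV object once, outside the loop, rather than once per line.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample lines standing in for the ERROR lines of a log file.
my @lines = (
    '21a94551,00:00:59.643;ERROR;',
    '21a94552,00:01:02.007;ERROR;',
);

# Build the CSV object once instead of once per line.
my $csv = Text::CSV_XS->new({ binary => 1 })
    or die "Cannot create Text::CSV_XS object: " . Text::CSV_XS->error_diag;

my @csv_lines;
for my $line (@lines) {
    # Comma and semicolon are the only delimiters, so the timestamp is kept whole.
    my @fields = split /[,;]/, $line;

    # combine() joins the fields into one CSV record; string() returns it.
    $csv->combine(@fields) or die "combine failed for: $line";
    push @csv_lines, $csv->string;
}

print "$_\n" for @csv_lines;
```

For the first sample line this should print 21a94551,00:00:59.643,ERROR, with the milliseconds still attached to the timestamp.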

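Answer 1 (score: 2)

What is inside [] is not a regular expression; it is a set of characters, character ranges, or classes. You have told it to split on any of the characters . * ; , when you only want to split on , or ;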
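
To make the difference concrete, here is a small sketch comparing two character classes. The pattern /[,;.*]/ is an assumption about what the asker's original split looked like, inferred from the answers; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = '21a94551,00:00:59.643;ERROR;';

# Inside [...] the dot and the star are literal characters, not "any character"
# and "zero or more", so this class also splits on the '.' in the timestamp.
my @too_many = split /[,;.*]/, $line;

# Only comma and semicolon are delimiters here, so the timestamp stays whole.
my @wanted = split /[,;]/, $line;

print join('|', @too_many), "\n";    # 21a94551|00:00:59|643|ERROR
print join('|', @wanted), "\n";      # 21a94551|00:00:59.643|ERROR
```

With the dot in the class, the milliseconds end up in a field of their own, which would explain the timestamp being broken apart in the CSV output.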