I have a CSV file that looks like this:
account,name,,,,type,"$a,mount.00",description
account,name,so,me,thing,type,$amount,"description
account,name,so,me,thing
account,name,so,me,thing,type,$amount,"description"
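Basically, I want to clean up the whole file, and I think the easiest way is to simply quote every column and make sure each row has 13 columns. The only problem is that some columns have an opening quote but no closing quote. This seems to happen only at the end of a line, but the file is too large for me to verify that fully.
What is the best way to sanitize this with Perl?
Thanks! - Matt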
Answer 0 (score: 1)
You can use Text::CSV to load the file and let it handle the cleanup. It is very good at that.
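use strict;
use warnings;
use Text::CSV;

my @rows;
my $csv = Text::CSV->new({
    binary             => 1,  # permit binary characters in fields
    allow_loose_quotes => 1,  # tolerate stray, unescaped quote characters
    always_quote       => 1,  # quote every field on output
}) or die "Cannot use Text::CSV: " . Text::CSV->error_diag();

# Parse each line of the sample data in the __DATA__ section.
while ( my $row = $csv->getline( \*DATA ) ) {
    push @rows, $row;
}

# End every output record with a newline, then print all rows.
$csv->eol("\n");
$csv->print( \*STDOUT, $_ ) for @rows;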
__DATA__
account,name,,,,type,"$a,mount.00",description
account,name,so,me,thing,type,$amount,"description
account,name,so,me,thing
account,name,so,me,thing,type,$amount,"description"
This produces the following output:
"account","name","","","","type","$a,mount.00","description"
"account","name","so","me","thing","type","$amount","""description"
"account","name","so","me","thing"
"account","name","so","me","thing","type","$amount","description"
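Note that every field is now quoted. The parser treated the lone (unclosed) double quote in the second row as a literal quote character rather than as the start of an unterminated quoted field, and escaped it on output. By default, the escape character is the double quote itself. I left it that way, but you can change it with $csv->escape_char('\\') or similar.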
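The script quotes every field but does not pad short rows, so the five-field row stays at five columns, whereas the question asks for 13 columns per row. Below is a minimal sketch of one way to pad, assuming the missing fields should become empty strings appended at the end of the row. `pad_row` is a hypothetical helper written for illustration, not part of Text::CSV.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV): pad a row, given as an
# array reference of fields, with empty strings up to $width columns.
# Rows that already have $width or more fields are returned unchanged,
# so no data is silently dropped.
sub pad_row {
    my ($row, $width) = @_;
    my @fields = @$row;
    push @fields, ('') x ($width - @fields) if @fields < $width;
    return \@fields;
}

# Demo on the short third row from the sample data.
my $padded = pad_row([qw(account name so me thing)], 13);
print scalar(@$padded), " fields\n";
print join(',', map { qq{"$_"} } @$padded), "\n";
```

In the script above, this would amount to changing `push @rows, $row;` to `push @rows, pad_row($row, 13);`. Rows with more than 13 fields pass through unchanged and would need separate handling.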