In Perl, how can I parse a CSV file where the fields contain comma-separated values?

Time: 2015-04-25 14:15:40

Tags: perl csv split

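I have a CSV file named file.csv that looks like this (this is just an example):

Name Nickname Phone email address
rob rob 534235 rob@example.com US,UK
nik nik 976784 nik@example.com,nik@foram.org UK
picy pic 327654,823747 pic@example.com US

The file has 5 headers, but some of them hold more than one value. Any header can hold any number of values, that is, more than 2 or 3.

I want to get output like this:

"Name","Alias","Phone","email","address"
"rob","rob","534235","rob@example.com","US,UK"
"nik","nik","976784","nik@example.com,nik@foram.org","UK"
"picy","pic","327654,823747","pic@example.com","US"

or the same for any particular column, with that column's data formatted as above.

I know about the split function and its limit argument in Perl, but that does not work.

How can I do this? Any ideas?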

1 answer:

Answer 0: (score: 0)

With the updated information, the solution to your problem is now trivial:

#!/usr/bin/env perl

use strict;
use warnings;

use Text::CSV_XS;
use Text::Table::Tiny;

my $csv = Text::CSV_XS->new;

# The first line is the header row.
my @data = ( $csv->getline(\*DATA) );

while (my $row = $csv->getline(\*DATA)) {
    # Skip rows whose field count differs from the header's.
    next unless @$row == @{ $data[0] };
    push @data, $row;
}

print Text::Table::Tiny::table(
    rows => \@data,
    header_row => 1,
);

__DATA__
"Name","Alias","Phone","email","address"
"rob","rob","534235","rob@example.com","US,UK"
"nik","nik","976784","nik@example.com,nik@foram.org","UK"
"picy","pic","327654,823747","pic@example.com","US"

Output:

+------+-------+---------------+-------------------------------+---------+
| Name | Alias | Phone         | email                         | address |
+------+-------+---------------+-------------------------------+---------+
| rob  | rob   | 534235        | rob@example.com               | US,UK   |
| nik  | nik   | 976784        | nik@example.com,nik@foram.org | UK      |
| picy | pic   | 327654,823747 | pic@example.com               | US      |
+------+-------+---------------+-------------------------------+---------+
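
For comparison, the minimal sketch below (core Perl only, using the nik row from the sample data) illustrates why plain split, with or without a limit, is not enough here: split knows nothing about quotes, so it also cuts at the comma inside the quoted email field. The variable names are illustrative only.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# One row from the sample data; single quotes keep the @ signs literal.
my $line = '"nik","nik","976784","nik@example.com,nik@foram.org","UK"';

# Plain split cuts at every comma, including the one inside the email field.
my @naive = split /,/, $line;

# A limit only caps the number of pieces; the split points are still wrong.
my @limited = split /,/, $line, 5;

printf "no limit: %d pieces\n", scalar @naive;
printf "limit 5:  %d pieces, last piece: %s\n", scalar @limited, $limited[-1];
```

The first call yields 6 pieces instead of 5, and with a limit of 5 the last piece is the leftover nik@foram.org","UK". A quote-aware parser such as Text::CSV_XS avoids both problems.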

You can also build a nested data structure by using the CSV parser to parse each line and then each field within that line:

while (my $row = $csv->getline(\*DATA)) {
    next unless @$row == @{ $data[0] };
    push @data, [
        map [ $csv->parse($_) ? $csv->fields : () ], @$row
    ];
}

This is useful if your main interest is in processing the data rather than just printing it.
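
As a rough illustration of what such processing could look like, here is a self-contained sketch that keys each record by column name, so that any single column can be pulled out. It assumes Text::CSV_XS is installed. The second parser object $inner, the %record layout, and the in-memory $text (used in place of DATA) are choices made for this example and are not part of the answer above.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Text::CSV_XS;

# Sample data held in a string so the sketch is self-contained.
my $text = <<'END';
"Name","Alias","Phone","email","address"
"rob","rob","534235","rob@example.com","US,UK"
"nik","nik","976784","nik@example.com,nik@foram.org","UK"
"picy","pic","327654,823747","pic@example.com","US"
END

my $outer = Text::CSV_XS->new({ binary => 1 });   # parses whole lines
my $inner = Text::CSV_XS->new({ binary => 1 });   # parses the values in one field

my ($header_line, @lines) = split /\n/, $text;
$outer->parse($header_line) or die "Cannot parse header line\n";
my @header = $outer->fields;

my @records;
for my $line (@lines) {
    next unless $outer->parse($line);
    my @fields = $outer->fields;
    next unless @fields == @header;    # skip rows with a wrong field count

    # Each column name maps to an array ref of its comma-separated values.
    my %record;
    for my $i (0 .. $#header) {
        $record{ $header[$i] } =
            [ $inner->parse($fields[$i]) ? $inner->fields : () ];
    }
    push @records, \%record;
}

# Example: list every email address, one per line, with the owner's name.
for my $record (@records) {
    print "$record->{Name}[0]: $_\n" for @{ $record->{email} };
}
```

With this layout, $records[1]{email} holds both of nik's addresses as separate elements, and any other column can be read the same way through its header name.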