How to treat comma-separated fields as a single field in Perl

Time: 2015-11-20 18:51:57

Tags: perl

I have an input CSV file.

model,taccode,               date

1001,234ghy,               20151120

1002,hj3456,531908jh,      20151120

1003,56789ui,78fgby,34s2fg,20151120

When I parse the CSV file using , as the delimiter, I am able to insert the rows into a temp table.

while (<CSVFILE>) {
                $line = $_;
                ($field1,$field2,$startDate)=split(/,/,trim($line));

                if($field2 eq "")
                {
                    $field2=0;
                }

                $Loaddatainsertsql->bind_param(1,$field1);
                $Loaddatainsertsql->bind_param(2,$field2);
                $Loaddatainsertsql->bind_param(3,$startDate);
                $Loaddatainsertsql->execute ();
}

Data output in the table:

model  taccode                date

1001   234ghy                20151120

1002   hj3456,531908jh        20151120

1003   56789ui,78fgby,34s2fg  20151120 

Can anyone help me?

1 answer:

Answer 0 (score: 6)

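The best solution is to enclose your values in quotes and parse the file properly with Text::CSV. If you have no control over the original data source, and the columns are anchored in some way (for example, the first two, the first three, or, in your case, the first and the last), you can do something like this: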

use strict;
use warnings;

while (<DATA>) {
    chomp;
    my @fields = split(/,/, $_);

    my $first  = shift(@fields);
    my $last   = pop(@fields);
    my $middle = join(',', @fields);

    printf("%-16s%-59s%s\n", $first, $middle, $last);
}

__DATA__
lifelock,LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD,b
lifelock,LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD,a
lifelock,LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD,c
mycityfaces,MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD,seed
flypaper,Flypaper,,web,Phoenix,AZ,1-Feb-08,3000000,USD,a
infusionsoft,Infusionsoft,105,software,Gilbert,AZ,1-Oct-07,9000000,USD,a
gauto,gAuto,4,web,Scottsdale,AZ,1-Jan-08,250000,USD,seed
chosenlist-com,ChosenList.com,5,web,Scottsdale,AZ,1-Oct-06,140000,USD,seed
chosenlist-com,ChosenList.com,5,web,Scottsdale,AZ,25-Jan-08,233750,USD,angel
digg,Digg,60,web,San Francisco,CA,1-Dec-06,8500000,USD,b
digg,Digg,60,web,San Francisco,CA,1-Oct-05,2800000,USD,a
facebook,Facebook,450,web,Palo Alto,CA,1-Sep-04,500000,USD,angel
facebook,Facebook,450,web,Palo Alto,CA,1-May-05,12700000,USD,a
facebook,Facebook,450,web,Palo Alto,CA,1-Apr-06,27500000,USD,b
facebook,Facebook,450,web,Palo Alto,CA,1-Oct-07,300000000,USD,c
facebook,Facebook,450,web,Palo Alto,CA,1-Mar-08,40000000,USD,c
facebook,Facebook,450,web,Palo Alto,CA,15-Jan-08,15000000,USD,c
facebook,Facebook,450,web,Palo Alto,CA,1-May-08,100000000,USD,debt_round

Output:

lifelock        LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD                b
lifelock        LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD                a
lifelock        LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD               c
mycityfaces     MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD         seed
flypaper        Flypaper,,web,Phoenix,AZ,1-Feb-08,3000000,USD              a
infusionsoft    Infusionsoft,105,software,Gilbert,AZ,1-Oct-07,9000000,USD  a
gauto           gAuto,4,web,Scottsdale,AZ,1-Jan-08,250000,USD              seed
chosenlist-com  ChosenList.com,5,web,Scottsdale,AZ,1-Oct-06,140000,USD     seed
chosenlist-com  ChosenList.com,5,web,Scottsdale,AZ,25-Jan-08,233750,USD    angel
digg            Digg,60,web,San Francisco,CA,1-Dec-06,8500000,USD          b
digg            Digg,60,web,San Francisco,CA,1-Oct-05,2800000,USD          a
facebook        Facebook,450,web,Palo Alto,CA,1-Sep-04,500000,USD          angel
facebook        Facebook,450,web,Palo Alto,CA,1-May-05,12700000,USD        a
facebook        Facebook,450,web,Palo Alto,CA,1-Apr-06,27500000,USD        b
facebook        Facebook,450,web,Palo Alto,CA,1-Oct-07,300000000,USD       c
facebook        Facebook,450,web,Palo Alto,CA,1-Mar-08,40000000,USD        c
facebook        Facebook,450,web,Palo Alto,CA,15-Jan-08,15000000,USD       c
facebook        Facebook,450,web,Palo Alto,CA,1-May-08,100000000,USD       debt_round
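The same first/last anchoring can be applied to the data from the question. The following is only a minimal sketch: `split_anchored` is a hypothetical helper name, the sample lines assume a comma before every date, and the padding around the commas is stripped during the split. The default of 0 for an empty middle field mirrors the question's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into (first, middle, last).
# Everything between the first and the last comma is kept as one field.
sub split_anchored {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;                  # trim both ends (also drops the newline)
    my @fields = split /\s*,\s*/, $line, -1;  # -1 keeps trailing empty fields
    return unless @fields >= 2;               # need at least a first and a last column

    my $first  = shift @fields;
    my $last   = pop @fields;
    my $middle = join ',', @fields;
    $middle = 0 if $middle eq '';             # same default as in the question
    return ($first, $middle, $last);
}

# Sample lines modelled on the question's input (assumed layout).
my @lines = (
    "1001,234ghy,               20151120\n",
    "1002,hj3456,531908jh,      20151120\n",
    "1003,56789ui,78fgby,34s2fg,20151120\n",
);

for my $line (@lines) {
    my ($model, $taccode, $date) = split_anchored($line);
    print "$model|$taccode|$date\n";
}
# prints:
# 1001|234ghy|20151120
# 1002|hj3456,531908jh|20151120
# 1003|56789ui,78fgby,34s2fg|20151120
```

The three returned values could then be handed to the `bind_param` calls from the question in place of `$field1`, `$field2`, and `$startDate`.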

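If your fields had been properly quoted from the start, Text::CSV would take care of this, although what "properly" means when quoting CSV is up for debate. In any case, enclosing fields in double quotes is unambiguous: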

use strict;
use warnings;

use Data::Dump;
use Text::CSV;

my $csv = Text::CSV->new;

while (<DATA>) {
    $csv->parse($_);
    my @fields = $csv->fields;
    dd(\@fields);
}

__DATA__
lifelock,"LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD",b
lifelock,"LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD",a
lifelock,"LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD",c
mycityfaces,"MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD",seed
flypaper,"Flypaper,,web,Phoenix,AZ,1-Feb-08,3000000,USD",a
infusionsoft,"Infusionsoft,105,software,Gilbert,AZ,1-Oct-07,9000000,USD",a

Output:

["lifelock", "LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD", "b"]
["lifelock", "LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD", "a"]
[
  "lifelock",
  "LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD",
  "c",
]
[
  "mycityfaces",
  "MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD",
  "seed",
]
[
  "flypaper",
  "Flypaper,,web,Phoenix,AZ,1-Feb-08,3000000,USD",
  "a",
]
[
  "infusionsoft",
  "Infusionsoft,105,software,Gilbert,AZ,1-Oct-07,9000000,USD",
  "a",
]
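If the source file cannot be changed but a cleaned-up copy can be written, the unquoted lines could be converted into the quoted form shown above. This is a rough sketch using only core Perl: `quote_field` and `requote_line` are hypothetical helper names, and it assumes, as before, that only the first and last columns are fixed. Embedded double quotes are doubled, which is the escaping convention described in RFC 4180.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one value in double quotes,
# doubling any double quotes it already contains.
sub quote_field {
    my ($value) = @_;
    (my $escaped = $value) =~ s/"/""/g;
    return qq{"$escaped"};
}

# Hypothetical helper: rewrite an unquoted line so that everything
# between the first and the last comma becomes a single quoted field.
sub requote_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;   # -1 keeps trailing empty fields
    return $line if @fields < 3;         # nothing between first and last column

    my $first = shift @fields;
    my $last  = pop @fields;
    return join ',', $first, quote_field(join ',', @fields), $last;
}

print requote_line("lifelock,LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD,b\n"), "\n";
# prints: lifelock,"LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD",b
```

Lines rewritten this way have the same shape as the quoted `__DATA__` section above, so they should parse into three fields with Text::CSV.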