Question

I am trying to read a huge CSV file into a 2D array. There must be a better way to split each line and store it in the 2D array in one step :s Cheers

my $j = 0;
while (<IN>) 
{

    chomp ;
    my @cols=();
    @cols   = split(/,/); 
    shift(@cols) ; #to remove the first number which is a line header
    for(my $i=0; $i<11; $i++) 
    {
       $array[$i][$j]  = $cols[$i];
    }        
    $j++;    
}

Answer 1

CSV is not trivial. Don't parse it yourself. Use a module like Text::CSV, which will do it correctly and quickly.

use strict;
use warnings;

use Text::CSV;

my @data;   # 2D array for CSV data
my $file = 'something.csv';

my $csv = Text::CSV->new;
open my $fh, '<', $file or die "Could not open $file: $!";

while( my $row = $csv->getline( $fh ) ) { 
    shift @$row;        # throw away first value
    push @data, $row;
}

This collects all your rows in @data, without you having to worry about parsing the CSV yourself.

Answer 2

If you ever find yourself reaching for a C-style for loop, there's a good chance that your program design can be improved.

while (<IN>) {
    chomp;

    my @cols = split(/,/); 
    shift(@cols); #to remove the first number which is a line header

    push @array, \@cols;
}

This assumes that you have a CSV file that can be processed with a simple split (i.e. the records contain no embedded commas).
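To make that caveat concrete, here is a minimal sketch of what a plain split does to a quoted field. The record below is made up for illustration and is not taken from the question's data.

```perl
use strict;
use warnings;

# Hypothetical record: the second field is quoted and contains a comma.
my $line = 'label1,"Smith, John",42';

# A plain split knows nothing about quoting, so it also cuts inside the quotes.
my @cols = split /,/, $line;

printf "%d fields\n", scalar @cols;   # 4 fields, although the record has 3
print "[$_]\n" for @cols;
```

If your data can contain fields like this, a real CSV parser such as Text::CSV is the safer choice.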

Answer 3

Aside: you can simplify your code with:

my @cols = split /,/;

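Your assignment to $array[$col][$row] uses an unusual order of subscripts; it complicates life. With your column/row assignment order in the array, I don't think there is a simpler way to do it.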
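If you really do need the $array[$col][$row] layout, one option is to read the rows the simple way first and transpose afterwards. This is only a sketch: the names @rows and @by_col are hypothetical, and the data is hard-coded here instead of being read from a file.

```perl
use strict;
use warnings;

# Rows as they would come out of a row-by-row read (labels already removed).
my @rows = ( [ 1, 2, 3 ], [ 3, 2, 1 ], [ 2, 3, 1 ] );

# Transpose: $by_col[$col][$row] holds what $rows[$row][$col] held.
my @by_col;
for my $r ( 0 .. $#rows ) {
    for my $c ( 0 .. $#{ $rows[$r] } ) {
        $by_col[$c][$r] = $rows[$r][$c];
    }
}

print "column $_: @{ $by_col[$_] }\n" for 0 .. $#by_col;
```

This does not make the inner loop go away. It only moves it out of the reading code, so the read itself stays short.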

Alternative: if you were to reverse the order of the subscripts in the array ($array[$row][$col]), you could consider using:

use strict;
use warnings;

my @array;
for (my $j = 0; <>; $j++) # For testing I used <> instead of <IN>
{
    chomp;
    $array[$j] = [ split /,/ ];
    shift @{$array[$j]};   # Remove the line label
}

for (my $i = 0; $i < scalar(@array); $i++)
{
    for (my $j = 0; $j < scalar(@{$array[$i]}); $j++)
    {
        print "array[$i,$j] = $array[$i][$j]\n";
    }
}

Sample data

label1,1,2,3
label2,3,2,1
label3,2,3,1

Sample output

array[0,0] = 1
array[0,1] = 2
array[0,2] = 3
array[1,0] = 3
array[1,1] = 2
array[1,2] = 1
array[2,0] = 2
array[2,1] = 3
array[2,2] = 1

Read a CSV file and save it in a 2D array
