Question

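I am trying to convert a large matrix (a tab-delimited file in which each column holds a different number of elements) into a hash of arrays. As a first step, I managed to load the file and use Text::CSV to turn the data columns into lists. In doing so, I noticed that every list is as long as the longest column, i.e. columns with fewer elements contain blank cells. This is my code so far: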

#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;

my $csv = Text::CSV->new({
    sep_char => "\t",
});

open( LIST, "<", "testfile" ) or die "No esta el archivo\n";

while (<LIST>) {
    if ($csv->parse($_)) {
        my @columns = $csv->fields();
        print "$columns[0]\t$columns[1]\t$columns[2]","\n";
    } else {
        my $err = $csv->error_input;
    }
}
close(LIST);

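The input matrix has 20 columns and between 4500 and 8500 rows, plus a header row (which I would ideally like to use as the keys of the hash). For simplicity, I built a "testfile" with three columns, no header, and a different number of elements per column (same format as the original input file). These are the contents of testfile: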

And this is the output. I think the "Use of uninitialized value..." warnings are related to the blank cells.

1  1  2
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 3.
2  2  
  3  4
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 5.
5  6  
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 6.
6  7  
7  8  8
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 8.
8  9  
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 9.
Use of uninitialized value in concatenation (.) or string at blast.cruce.especies.pl line 16, <LIST> line 9.

Answer 1

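You are seeing those warnings because you have input similar to the following:

\t\t1

being parsed into a list like:

( undef, undef, 1 )

because zero-width (empty) fields are being pulled in. The problem is not in the parsing but in the printing: interpolating an undef element into a string triggers the "uninitialized value" warning. Note that with its default options Text::CSV returns an empty field as an empty string rather than undef, so in practice the undef elements come from rows that contain fewer than the three fields being printed (or from enabling blank_is_undef or empty_is_undef). If you want to inspect the contents after parsing, use Data::Printer or Data::Dumper to format the output rows, so that you do not get warnings for interpolating undef values.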
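As a rough illustration of the point about printing, here is a minimal sketch that pads undefined or missing elements before joining them, and uses Data::Dumper to inspect a row. The helper name format_row and the sample rows are assumptions made for this example; the rows stand in for what $csv->fields() might return, and only core modules are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: join the first $n fields of a parsed row with tabs,
# substituting '' for any element that is undef or missing.
sub format_row {
    my ($fields, $n) = @_;
    return join "\t", map { $fields->[$_] // '' } 0 .. $n - 1;
}

# Sample rows standing in for the result of $csv->fields();
# the second row is short, so its third element does not exist.
my @rows = ( [ 1, 1, 2 ], [ 2, 2 ], [ '', 3, 4 ] );

# No "uninitialized value" warnings here, even for the short row.
print format_row($_, 3), "\n" for @rows;

# Data::Dumper shows exactly which elements are present in a row.
local $Data::Dumper::Indent = 0;
print Dumper($rows[1]), "\n";
```

The defined-or operator (//) needs Perl 5.10 or later; on older versions a defined(...) ? ... : '' test does the same job.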

Edit: your code is correct:

#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;

my $csv = Text::CSV->new({
    sep_char => ",",
});

while (<DATA>) {
    if ($csv->parse($_)) {
        my @columns = $csv->fields();
        print "$columns[0]\t$columns[1]\t$columns[2]", "\n";
    } else {
        my $err = $csv->error_input;
    }
}

__DATA__
1,1,2
2,2,
,3,4
5,6,
6,7,
7,8,8
8,9,

This prints:

C:\>perl testcsv.pl
1       1       2
2       2
        3       4
5       6
6       7
7       8       8
8       9

(The only changes are reading from the DATA handle instead of a file, and using a comma as the separator.)

Answer 2

Here I am again with some more problems. I reduced the code above to:

#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;
use Data::Dumper;
$Data::Dumper::Indent=0;

my @columns;
my %matrixhash;
my $csv = Text::CSV->new( { sep_char => "," } );

open( LIST, "<", "testfile.csv" ) or die "No esta el archivo\n";

while (<LIST>) {
    if ( $csv->parse($_) ) {
        @columns = $csv->fields();

        %matrixhash = (
            a => $columns[0],
            b => $columns[1],
            c => $columns[2],
        );
    }
    print Dumper \@columns;
    print "\n";
}
print "Printing Hash: ",Dumper \%matrixhash;
close(LIST);

The output is:

$VAR1 = ['1','1','2'];
$VAR1 = ['2','2',''];
$VAR1 = ['','3','4'];
$VAR1 = ['5','6',''];
$VAR1 = ['6','7',''];
$VAR1 = ['7','8','8'];
$VAR1 = ['8','9',''];
$VAR1 = {'c' => '','a' => '8','b' => '9'};

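This corresponds to the rows of my matrix, and the hash only holds the last row. To continue with my downstream analysis, I need a hash that has the column headers as keys and the contents of each matrix column as values, without the blanks of course. For example, for the first column of the matrix, the hash entry would be: a => [1,2,5,6,7,8]. So my second thought was: what happens if I transpose the matrix?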

$VAR1 = ['1','2','','5','6','7','8'];
$VAR1 = ['1','2','3','6','7','8','9'];
$VAR1 = ['2','','4','','','8',''];
Printing Hash: $VAR1 = {'c' => '4','a' => '2','b' => ''}

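This is closer to what I need, but the hash problem remains: it still only holds the last row. Of course, I would probably also need to transpose the matrix in code (the example above was transposed by hand).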
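One possible way to get the structure described above without transposing at all is to push each non-blank cell onto an array keyed by its column header while looping over the rows. The following is only a sketch under assumptions: the helper name columns_to_hash is made up for illustration, the header names a, b, c are the ones used in the hash above, and the rows are written inline in the form $csv->fields() returns them rather than read from testfile.csv.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: turn a header row plus data rows into a hash of
# arrays, one array per column, skipping blank or missing cells.
sub columns_to_hash {
    my ($header, @rows) = @_;
    my %matrix;
    $matrix{$_} = [] for @$header;    # every column gets an (empty) array
    for my $row (@rows) {
        for my $i (0 .. $#$header) {
            my $cell = $row->[$i];
            next unless defined($cell) && length($cell);
            push @{ $matrix{ $header->[$i] } }, $cell;
        }
    }
    return \%matrix;
}

# The three-column test data, with the assumed header names a, b, c.
my @header = qw(a b c);
my @rows   = (
    [ 1, 1, 2 ], [ 2, 2, '' ], [ '', 3, 4 ], [ 5, 6, '' ],
    [ 6, 7, '' ], [ 7, 8, 8 ], [ 8, 9, '' ],
);

my $matrix = columns_to_hash(\@header, @rows);

local $Data::Dumper::Indent   = 0;
local $Data::Dumper::Sortkeys = 1;
print Dumper($matrix), "\n";
```

For the test data, column a ends up holding 1, 2, 5, 6, 7, 8, column b holds 1, 2, 3, 6, 7, 8, 9, and column c holds 2, 4, 8. With the real file, the header row would be the first parsed line and each later line would be passed through the same loop, so the whole matrix never has to be transposed in memory.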

Again, all help is welcome. Christian.

Perl matrix to hash conversion