Question

I have the following code:

#!/usr/bin/perl
# splits.pl

use strict;
use warnings;
use diagnostics;

my $pivotfile = "myPath/Internal_Splits_Pivot.txt";

open PIVOTFILE, $pivotfile or die $!;

while (<PIVOTFILE>) { # loop through each line in file

    next if ($. == 1); # skip first line (contains business segment code)
    next if ($. == 2); # skip second line (contains transaction amount text)

    my @fields = split('\t',$_);  # split fields for line into an array     

    print scalar(grep $_, @fields), "\n"; 

}

The data in the text file is:

    4   G   I   M   N   U   X
    Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount
0000-13-I21             600         
0001-8V-034BLA              2,172   2,172       
0001-8V-191GYG                  13,125      4,375
0001-9W-GH5B2A  -2,967.09       2,967.09    25.00

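Given the number of defined elements in each line, I expected the Perl script to output: 2 3 3 4. The file is a tab-delimited text file with 8 columns.

Instead, I get 3 4 3 4, and I don't understand why!

For background, I am using Counting array elements in Perl as the basis for my development, since I am trying to count the number of elements in each line to determine whether I need to skip that line.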

Answer 1

The problem is probably in this line:

my @fields = split('\t',$_);  # split fields for line into an array

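The '\t' is in single quotes, so the tab character is not interpolated (split still treats the string as a pattern, in which \t matches a tab, so this alone may not explain the miscount). Also, your file does not appear to be tab-delimited, at least as posted here on SO. I changed the split regex to match any whitespace, ran the code on my machine, and got the "correct" result: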

my @fields = split(/\s+/,$_);  # split fields for line into an array

Result:

2
3
3
4

Answer 2

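I suspect you have spaces mixed in with the tabs in some places, and your grep test will consider " " (a single space) to be true.

What does: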

use Data::Dumper;
$Data::Dumper::Useqq=1;
print Dumper [<PIVOTFILE>];

show?
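As a minimal sketch of the same diagnostic on a single string rather than the whole file: the sample line and the helper name `visible` below are hypothetical and not taken from the question. With `Useqq` set, tabs and newlines should appear as `\t` and `\n` escapes, while literal spaces stay as spaces, so a mix of the two becomes easy to spot.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq  = 1;  # show control characters as escapes such as \t and \n
$Data::Dumper::Terse  = 1;  # omit the leading "$VAR1 = "
$Data::Dumper::Indent = 0;  # keep each dump on one line

# Hypothetical sample line (an assumption, not the asker's data):
# a tab, two spaces, and another tab before "600".
my $line = "0000-13-I21\t  \t600\n";

# Return the escaped form of a string so that whitespace is unambiguous.
sub visible {
    my ($text) = @_;
    return Dumper($text);
}

print visible($line), "\n";
```

Running the same dump on a few real lines of the input file is usually enough to tell whether the separators are pure tabs.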

Answer 3

There are not only tabs but also spaces.

Try splitting on whitespace instead, as shown below:

#!/usr/bin/perl
# splits.pl

use strict;
use warnings;
use diagnostics;



while (<DATA>) { # loop through each line in file

    next if ($. == 1); # skip first line (contains business segment code)
    next if ($. == 2); # skip second line (contains transaction amount text)


    my @fields = split(" ",$_);  # split fields by SPACE     

    print scalar(@fields), "\n"; 

}

__DATA__
    4   G   I   M   N   U   X
    Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount  Transaction Amount
0000-13-I21             600         
0001-8V-034BLA              2,172   2,172       
0001-8V-191GYG                  13,125      4,375
0001-9W-GH5B2A  -2,967.09       2,967.09    25.00

Output:

2
3
3
4

Answer 4

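As a side note:

For background, I am using Counting array elements in Perl as the basis for my development, since I am trying to count the number of elements in each line to determine whether I need to skip that line.

Now I understand why you use grep to count array elements. It matters when your array contains undefined values, as here:

use feature 'say';   # enables say (Perl 5.10+); needed by the snippets below
my @a;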
$a[1] = 42;      # @a contains the list (undef, 42)
say scalar @a;   # 2

or when you delete entries by hand:

my @a = split /,/ => 'foo,bar';    # @a contains the list ('foo', 'bar')
delete $a[0];                      # @a contains the list (undef, 'bar')
say scalar @a;                     # 2

But in many cases, especially when you use an array just to store a list and never manipulate individual elements, scalar @a is perfectly fine.

my @a = (1 .. 17, 1 .. 25);        # (1, 2, ..., 17, 1, 2, .., 25)
say scalar @a;                     # 42

It is important to understand what grep does! In your case

print scalar(grep $_, @fields), "\n";

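grep returns the list of true values in @fields, and you then print how many there are. But sometimes this is not what you want or expect:

my @things = (17, 42, 'foo', '', 0);  # even '' and 0 are things
say scalar grep $_ => @things;        # 3!

Because the empty string and the number 0 are false values in Perl, this idiom does not count them. So if you want to know the length of an array, use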

say scalar @array; # number of array entries

If you want to count the true values, use

say scalar grep $_ => @array; # number of true values

But if you want to count the defined values, use

say scalar grep defined($_) => @array; # number of defined values

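I'm pretty sure you already learned this from the other answers on the linked page. With hashes the situation is slightly more complicated, because setting something to undef is not the same as deleting it: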

my %h = (a => 0, b => 42, c => 17, d => 666);
$h{c} = undef;   # still there, but undefined
delete $h{d};    # BAM! $h{d} is gone!

What happens when we try to count the values?

say scalar grep $_ => values %h;   # 1

because 42 is the only true value in %h.

say scalar grep defined $_ => values %h;   # 2

because 0 is defined, but false.

say scalar grep exists $h{$_} => qw(a b c d);   # 3

because an undefined value can still exist. Conclusion:

Know what you are doing instead of copy'n'pasting code snippets :)
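The distinctions above can be collected into one runnable sketch. The helper name `count_three_ways` and the sample list are hypothetical, chosen only for illustration; the hash part repeats the %h example from this answer.

```perl
use strict;
use warnings;

# Return (length, number of true values, number of defined values)
# for a list of values.
sub count_three_ways {
    my @values  = @_;
    my $length  = scalar @values;             # every entry, including undef
    my $true    = grep { $_ } @values;        # '' , 0 and undef are not counted
    my $defined = grep { defined } @values;   # only undef is not counted
    return ($length, $true, $defined);
}

my @things = (17, 42, 'foo', '', 0, undef);
print join(' ', count_three_ways(@things)), "\n";   # 6 3 5

# Hashes: an undefined value still exists, a deleted key does not.
my %h = (a => 0, b => 42, c => 17, d => 666);
$h{c} = undef;
delete $h{d};
my $existing = grep { exists $h{$_} } qw(a b c d);
print "$existing\n";                                 # 3
```

Assigning grep to a scalar puts it in scalar context, so it yields the count directly, which is the same effect as the explicit `scalar` in the snippets above.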

Answer 5

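Your code works for me. The problem may be that the input file contains some "hidden" whitespace-only fields (for example, whitespace other than tabs). For example:

A<tab><space><CR> gives two fields, A and <space><CR>
A<tab>B<tab><CR> gives three: A, B, and <CR> (remember that the line ending is part of the input!)

I suggest you chomp each line; beyond that, you will have to clean whitespace-only fields out of the array. For example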

scalar(grep /\S/, @fields)

should do it.
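A minimal sketch of that suggestion, assuming tab-separated input; the function name `count_nonblank_fields` and the sample strings are hypothetical. It chomps the line, splits on tabs, and counts only the fields that contain at least one non-whitespace character.

```perl
use strict;
use warnings;

# Count the fields of a tab-separated line that contain at least one
# non-whitespace character.
sub count_nonblank_fields {
    my ($line) = @_;
    chomp $line;                          # removes the trailing newline only
    my @fields = split /\t/, $line, -1;   # limit -1 keeps trailing empty fields
    return scalar grep { /\S/ } @fields;
}

print count_nonblank_fields("A\t \n"), "\n";     # fields "A", " "     -> 1
print count_nonblank_fields("A\tB\t\n"), "\n";   # fields "A", "B", "" -> 2
print count_nonblank_fields("A\t\t0\n"), "\n";   # fields "A", "", "0" -> 2
```

Unlike `grep $_`, the `/\S/` test counts a field holding a literal 0, and it ignores a stray carriage return left behind when a file has Windows line endings.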

Answer 6

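Lots of great help on this question, and fast too!

After a long, long learning process, this is what I came up with. It works well and gives the expected results.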

#!/usr/bin/perl
# splits.pl

use strict;
use warnings;
use diagnostics;

my $pivotfile = "myPath/Internal_Splits_Pivot.txt";

open PIVOTFILE, $pivotfile or die $!;

while (<PIVOTFILE>) { # loop through each line in file

    next if ($. == 1); # skip first line (contains business segment code)
    next if ($. == 2); # skip second line (contains transaction amount text)

    chomp $_; # remove the trailing newline (chomp does not strip other whitespace)

    my @fields = split(/\t/,$_);  # split fields for line into an array     

    print scalar(grep $_, @fields), "\n"; 

}

Perl grep not returning expected value