Parse :: FixedLength修剪问题

时间:2016-01-06 00:53:52

标签: perl

在我之前的question我曾经问过如何避免Parse :: FixedLength修剪所有零。代码@bolav建议使用我正在使用的示例数据但不知何故它似乎不适用于我的新数据。

它似乎应该可以工作但不知何故它正在为这些数据修剪所有零。我很可能犯了一个非常明显的错误,但我无法弄清楚它是什么。感谢您的帮助。

    #!/usr/bin/perl

    use strict;
    use warnings; 
    use Parse::FixedLength;
    use Data::Dumper;

    my $parser = Parse::FixedLength->new([
              field1 => '12R0:1:12',
              field2 => '2:13:14',
              field3 => '5R0:15:19',
              field4 => '10R0:20:29',
              field5 => '2R0:30:31',
              field6 => '3R0:32:34'
              ], {trim => '1'});

    $parser->{TPAD}[0] = qr/^0+(?=\d)/;   # Modification suggested by @bolav

    while (<DATA>) {
        warn "No record terminator found!\n" unless chomp;
        warn "Short Record!\n" unless $parser->length == length;
        my $data = $parser->parse($_);
        print Dumper $data;
    }
    __DATA__
    119401122910XX42152931177771001000
    119401122910XX42152931177771001010

最后一个字段应为010,但会输出blank10更新:我不希望field6的输出为000010 - 我只需删除trim即可选项。正则表达式应该解决这个问题,但由于某种原因它没有这样做。

$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field6' => '',
                 'field4' => '9311777710',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field5' => '1'
               }, 'Parse::FixedLength::HashAsObj::Href1' );
$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field6' => '10',
                 'field4' => '9311777710',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field5' => '1'
               }, 'Parse::FixedLength::HashAsObj::Href1' );

3 个答案:

答案 0 :(得分:2)

更新

好的,我已经阅读了你原来的问题,并且更好地了解你想要的是什么。你真的应该让每个问题都独立存在 - Stack Overflow不是一个论坛

建议的修改对此配置不起作用的原因是$parser->{TPAD}是要从每个对齐的<的前面删除的正则表达式的数组 / em>字段。在您的情况下,它是除field2以外的所有字段。您只修改数组的第一个元素,因此您只修复field1

这是一个更通用的修改,可以更改 $parser->{TPAD}数组的每个元素,这样它总是至少留下字段的最后一个字符,无论是什么。请注意,如果您的填充字符是5R等格式的空格,那么它会将全空间字段修剪为单个空格而不是将其清空

use strict;
use warnings;

use Parse::FixedLength;
use Data::Dump;

my $parser = Parse::FixedLength->new(
    [   field1 => '12R0:1:12',
        field2 => '2:13:14',
        field3 => '5R0:15:19',
        field4 => '10R0:20:29',
        field5 => '2R0:30:31',
        field6 => '3R0:32:34'
    ],
    { trim => 1 }
);

$_ = qr/$_(?=.)/ for @{ $parser->{TPAD} };

while (<DATA>) {
    my $data = $parser->parse($_);
    dd $data;
}

__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出

bless({
  field1 => 119401122910,
  field2 => "XX",
  field3 => 42152,
  field4 => 9311777710,
  field5 => 1,
  field6 => 0,
}, "Parse::FixedLength::HashAsObj::Href1")
bless({
  field1 => 119401122910,
  field2 => "XX",
  field3 => 42152,
  field4 => 9311777710,
  field5 => 1,
  field6 => 10,
}, "Parse::FixedLength::HashAsObj::Href1")


我注意到,只需删除trim => 1选项,您的代码就会产生您想要的结果。但是我认为你有理由想要这样做,所以这里有一个解决方案

由于Parse::FixedLength允许使用pack模板元素,因此您可以明确指定A字段,以便按字面意义传输数据。它与模块用于其他字段的模板相同,但它会禁用该字段的trim选项

此代码按您的要求执行

    use strict;
    use warnings;

    use Parse::FixedLength;
    use Data::Dump;

    my $parser = Parse::FixedLength->new([
              field1 => '12R0:1:12',
              field2 => '2:13:14',
              field3 => '5R0:15:19',
              field4 => '10R0:20:29',
              field5 => 'A2:30:31',
              field6 => 'A3:32:34'
              ], {trim => '1'});

    while ( <DATA> ) {
        my $data = $parser->parse($_);
        dd $data;
    }

    __DATA__
    119401122910XX42152931177771001000
    119401122910XX42152931177771001010

输出

    bless({
      field1 => 119401122910,
      field2 => "XX",
      field3 => 42152,
      field4 => 9311777710,
      field5 => "01",
      field6 => "000",
    }, "Parse::FixedLength::HashAsObj::Href1")
    bless({
      field1 => 119401122910,
      field2 => "XX",
      field3 => 42152,
      field4 => 9311777710,
      field5 => "01",
      field6 => "010",
    }, "Parse::FixedLength::HashAsObj::Href1")

答案 1 :(得分:1)

您只需用0:

替换已修剪到骨骼的值
my $data = $parser->parse($_);

for my $val (values %$data) {
    $val =~ s/^$/0/  #If the val is blank, replace with a 0
}

以下是一个完整的例子:

use strict;
use warnings; 
use 5.020;

use Parse::FixedLength;
use Data::Dumper;
my $parser = Parse::FixedLength->new([
          field1 => '12R0:1:12',
          field2 => '2:13:14',
          field3 => '5R0:15:19',
          field4 => '10R0:20:29',
          field5 => '2R0:30:31',
          field6 => '3R0:32:34'
          ], {trim => '1'});

#$parser->{TPAD}[0] = qr/^0+(?=\d)/;   # Modification suggested by @bolav

while (<DATA>) {
    warn "No record terminator found!\n" unless chomp;
    warn "Short Record!\n" unless $parser->length == length;
    my $data = $parser->parse($_);
    for my $val (values %$data) {
        $val =~ s/^$/0/
    }
    #s/(\w+)/\u\L$1/g for @$data{qw(first_name last_name)};
    print Dumper $data;
}

__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出:

$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field5' => '1',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field4' => '9311777710',
                 'field6' => '0'
               }, 'Parse::FixedLength::HashAsObj::Href1' );
$VAR1 = bless( {
                 'field1' => '119401122910',
                 'field5' => '1',
                 'field2' => 'XX',
                 'field3' => '42152',
                 'field4' => '9311777710',
                 'field6' => '10'
               }, 'Parse::FixedLength::HashAsObj::Href1' );

答案 2 :(得分:1)

您可以直接使用unpack()

use strict;
use warnings; 
use 5.020;

use Data::Dumper;

=begin
print " ";

for my $i (1..3) {
    printf '%10s', $i;
}
print("\n");

say "0123456789" x 4;
say "119401122910XX42152931177771001000";

--output:--
          1         2         3
0123456789012345678901234567890123456789
119401122910XX42152931177771001000
=cut

# @12 => start at index position 12 in the record (0 based indexing)
# A5  => read 5 characters
my $pattern = <<'END_OF_PATTERN';

@0      A12 
@12     A2
@14     A5
@19     A10
@29     A2
@31     A3

END_OF_PATTERN


while (my $line = <DATA>) {

    my @fields = unpack $pattern, $line;

    for my $field (@fields[-2, -1]) {

        $field =~ s/
                        ^       #Match start of string, followed by...
                        0*      #A literal 0, zero or more times (greedy), followed by...
                        (\d+)   #A digit, one or more times, captured in group 1, followed by...
                        $       #The end of the string.
                  /$1/xms;   #Replace all the above with capture group 1.      
    }

    say Dumper @fields;
    say '-' x 10;
}


__DATA__
119401122910XX42152931177771001000
119401122910XX42152931177771001010

输出:

$VAR1 = '119401122910';
$VAR2 = 'XX';
$VAR3 = '42152';
$VAR4 = '9311777710';
$VAR5 = '1';
$VAR6 = '0';

----------
$VAR1 = '119401122910';
$VAR2 = 'XX';
$VAR3 = '42152';
$VAR4 = '9311777710';
$VAR5 = '1';
$VAR6 = '10';

----------

请注意,您可以从任何索引开始,可以重读部分记录等。例如:

@0  A10    #Start at index 0, read 10 characters
@0  A5     #Go back to index 0, read 5 charters
@20 A2     #Jump to index 20, read 2 characters
@18 A12    #Go back to index 18, read 12 characters