Perl - open file - last line is missing if there is no newline after it

Time: 2014-07-22 14:23:31

Tags: perl file-io

Hello, can someone explain why these two scripts give me different output:

01.pl

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

open FDGROUP, "< file" or die "Can't open file: $!\n";
my @file = <FDGROUP>;
close FDGROUP;

@file = grep {/\S/} @file;

@file = grep {s/\r//} @file;
@file = grep {s/\n//} @file;

print Dumper @file;

02.pl

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

open FDGROUP, "< file" or die "Can't open file: $!\n";
my @file = <FDGROUP>;
close FDGROUP;

@file = grep {/\S/} @file;

my $j = 0;
foreach (@file){
  $_ =~ s/\r//;
  $_ =~ s/\n//;
  $file[$j++] = $_;
}

print Dumper @file;

Output:

wakatana@azureus ~/scripts/stackoverflow
$ perl 01.pl
$VAR1 = '1';
$VAR2 = '2';
$VAR3 = '3';
$VAR4 = '4';
$VAR5 = '5';
$VAR6 = '6';

wakatana@azureus ~/scripts/stackoverflow
$ perl 02.pl
$VAR1 = '1';
$VAR2 = '2';
$VAR3 = '3';
$VAR4 = '4';
$VAR5 = '5';
$VAR6 = '6';
$VAR7 = '7';

wakatana@azureus ~/scripts/stackoverflow
$ od -ab file
0000000   1  cr  nl   2  cr  nl   3  cr  nl   4  cr  nl   5  cr  nl   6
        061 015 012 062 015 012 063 015 012 064 015 012 065 015 012 066
0000020  cr  nl   7
        015 012 067
0000023

wakatana@azureus ~/scripts/stackoverflow
$ perl -e 'print $/' | od -ab
0000000  nl
        012
0000001

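When I add another newline after the last line of the file, both scripts give the same result (7 variables). I know that chomp is meant for this kind of operation, but when I use the following script: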

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

open FDGROUP, "< file" or die "Can't open file: $!\n";
my @file = <FDGROUP>;
close FDGROUP;

@file = grep {/\S/} @file;
chomp @file;
print Dumper @file;

I get the following output:

wakatana@azureus ~/scripts/stackoverflow
$ perl 03.pl
';AR1 = '1
';AR2 = '2
';AR3 = '3
';AR4 = '4
';AR5 = '5
';AR6 = '6
';AR7 = '7

This is probably caused by the CR character or something similar.

All of this was done under Cygwin.

Thanks

3 answers:

Answer 0 (score: 2)

These statements:

@file = grep {/\S/} @file; # strips any element which doesn't have non-whitespace characters
@file = grep {s/\r//} @file; # strips any elem which doesn't have a \r, strips \r from those that do
@file = grep {s/\n//} @file; # strips any elem which doesn't have a \n, strips \n from those that do

Each of these builds a new array. The new array contains only those input elements for which grep's { block } returns a true value.

If the last line has no \n, that line is left out.
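
A minimal sketch of this effect. It uses an in-memory list that mirrors the file from the question (CRLF line endings, nothing after the final 7) rather than reading the file itself; the names @after_cr and @after_nl are only for illustration:

```perl
use strict;
use warnings;

# Same content as the question's file: CRLF endings, last line unterminated.
my @file = ("1\r\n", "2\r\n", "3\r\n", "4\r\n", "5\r\n", "6\r\n", "7");

# s/\r// is true only when it actually removed a \r, so "7" is filtered out here.
my @after_cr = grep { s/\r// } @file;

# Every surviving element still ends in \n, so nothing more is lost.
my @after_nl = grep { s/\n// } @after_cr;

print scalar(@after_nl), " elements: @after_nl\n";   # 6 elements: 1 2 3 4 5 6
```

In this data the element is already lost at the \r step, because "7" contains neither a \r nor a \n.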

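Answer 1 (score: 1)

grep only keeps an element when the expression matches. The last line has no \n, so the substitution returns nothing true and the element is dropped.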
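
To make that visible: s/// returns the number of substitutions made, which is a false value when nothing matched. The sketch below shows this and then one possible way to strip line endings without filtering, using map with the /r modifier (this assumes Perl 5.14 or later; the sample data and the name @clean are made up for illustration):

```perl
use 5.014;      # /r needs Perl 5.14+; this also enables strict
use warnings;

my @file = ("6\r\n", "7");    # sample data: the last element has no line ending

# s/// is true when it changed something and false otherwise.
for my $line (@file) {
    my $copy    = $line;
    my $changed = ( $copy =~ s/\r?\n\z// ) ? 'true' : 'false';
    print "substitution returned $changed\n";
}

# map keeps every element; with /r the substitution returns the modified copy.
my @clean = map { s/\r?\n\z//r } @file;
print scalar(@clean), " elements: @clean\n";    # 2 elements: 6 7
```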

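Answer 2 (score: 1)

Unlike my other answer https://stackoverflow.com/a/24890193/3755747, this is technically not an answer to what you actually asked... but your code is a rather old style of Perl, so here are some more modern alternatives.

Fully written out, basic Perl: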

use strict;
use warnings;
use Data::Printer; # I prefer this over Data::Dumper

open( my $fh, '<', 'file' ) or die "can't open 'file': $!";

my @lines;
while ( my $line = <$fh> ) {
    $line =~ s/^(.*?)\r?\n?$/$1/;
    next if $line eq '';
    push @lines, $line;
}
close $fh or die "can't close 'file': $!";

p( @lines );

A very compact version, but with explanation:

use strict;
use warnings;
use File::Slurp;   # provides read_file
use Data::Printer;

my @lines = grep {
    s/
        ^         # start of string
          (.*?)   # non-greedy capture; a greedy .* would take the \r as well
          \r? \n? # optional CR, optional LF
        $         # end of string
     /$1/x        # replace with the match, whitespace allowed in regex
    && length     # and string has to have some length remaining
} read_file( 'file' );

p( @lines );

A different way, using split:

use Modern::Perl '2012';
use File::Slurp;
use Data::Printer;

# added parentheses around split arguments for clarity, they're not needed
my @lines = grep { length } split( /\r?\n/, read_file 'file' );
p( @lines );
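
To illustrate what the split version does without touching the disk, here is a sketch that applies the same pattern to a string. The variable $content is a stand-in for what read_file returns in scalar context, and its contents are made-up sample data with CRLF endings, one blank line, and an unterminated last line:

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents.
my $content = "1\r\n2\r\n\r\n3\r\n7";

# split removes the separators; grep { length } then drops the empty
# field produced by the blank line.
my @lines = grep { length } split /\r?\n/, $content;

print scalar(@lines), " lines: @lines\n";    # 4 lines: 1 2 3 7
```

Note that grep { length } only removes truly empty fields; a line containing just spaces would survive, unlike with the /\S/ filter in the question.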

Slurping is perfectly possible without a module, too:

use Modern::Perl;
use Data::Printer;

open( my $fh, '<', 'file' ) or die "can't open 'file': $!";
my @lines = grep { s/^(.*?)\r?\n?$/$1/ && length } <$fh>;
close $fh or die "can't close 'file': $!";

p( @lines );

I think I prefer the split version.