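IO::File->open() doesn't seem to respect use open in the following program, which seems strange to me and appears to contradict the documentation. Or maybe I'm doing something wrong. Rewriting my code to not use IO::File shouldn't be hard.
I expect the output to be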
$VAR1 = \"Hello \x{213} (r-caret)";
Hello ȓ (r-caret)
Hello ȓ (r-caret)
Hello ȓ (r-caret)
But I get this error instead: "Oops: Malformed UTF-8 character (unexpected end of string) in print at ./run.pl line 33."
That doesn't seem right to me at all.
#!/usr/local/bin/perl
use utf8;
use v5.16;
use strict;
use warnings;
use warnings qw(FATAL utf8);
use diagnostics;
use open qw(:std :utf8);
use charnames qw(:full :short);
use File::Basename;
my $application = basename $0;
use Data::Dumper;
$Data::Dumper::Indent = 1;
use Try::Tiny;
my $str = "Hello ȓ (r-caret)";
say Dumper(\$str);
open(my $fh, '<', \$str);
print while ($_ = $fh->getc());
close($fh);
print "\n";
try {
    use IO::File;
    my $fh = IO::File->new();
    $fh->open(\$str, '<');
    print while ($_ = $fh->getc());
    $fh->close();
    print "\n";
}
catch {
    say "\nOops: $_";
};
try {
    use IO::File;
    my $fh = IO::File->new();
    $fh->open(\$str, '<:encoding(UTF-8)');
    print while ($_ = $fh->getc());
    $fh->close();
    print "\n";
}
catch {
    say "\nOops: $_";
};
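Answer 0 (score: 7)
I believe what's happening here is that use open is a lexical pragma, which means it only affects calls to open() made in the same lexical scope. A lexical scope is the code inside the same block (or file). IO::File->open is a wrapper around open(), so the actual open() call happens inside IO::File, outside the lexical scope of your use open.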
{
    use open qw(:utf8);
    # ...same lexical scope...
    {
        # ...inner lexical scope...
        # ...inherits from the outer...
    }
    # ...still the same lexical scope...
    foo();
}
sub foo {
    # ...outside "use open"'s lexical scope...
}
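In the above example, even though foo() is called within the lexical scope of use open, the code inside foo() is compiled outside that scope and so is not affected by it.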
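The scoping rule above can be checked with a small sketch. It assumes an in-memory file handle and uses the built-in PerlIO::get_layers to list the layers on each handle; open_elsewhere is a hypothetical helper name standing in for a wrapper such as IO::File->open.

```perl
use strict;
use warnings;

# Hypothetical helper compiled OUTSIDE the block that contains "use open".
sub open_elsewhere {
    my ($buf_ref) = @_;
    open(my $fh, '<', $buf_ref) or die "open failed: $!";
    return $fh;
}

my $bytes = "abc";
my (@inner_layers, @outer_layers);

{
    use open qw(:utf8);    # lexical: only affects open() calls compiled in this block

    open(my $inner, '<', \$bytes) or die "open failed: $!";
    my $outer = open_elsewhere(\$bytes);    # called here, but its open() is compiled outside

    @inner_layers = PerlIO::get_layers($inner);
    @outer_layers = PerlIO::get_layers($outer);
}

print "inner: @inner_layers\n";
print "outer: @outer_layers\n";
```

If the pragma behaves as described, only the handle opened directly inside the block should report a utf8 layer.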
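It would be considerate if IO::File honored the caller's open.pm settings. That is not trivial, but it is possible. A similar problem plagued autodie. It was fixed, and the same fix would probably work in IO::File.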
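Until then, one possible workaround is to give IO::File->open the layer explicitly, so it does not depend on the caller's use open. The following is a minimal sketch, not a definitive fix; it assumes the in-memory buffer already holds UTF-8 encoded bytes (produced here with Encode::encode_utf8) rather than a string of wide characters.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode_utf8);
use IO::File;

my $str   = "Hello ȓ (r-caret)";
my $bytes = encode_utf8($str);    # assumption: the buffer holds UTF-8 bytes

# Pass the decoding layer in the mode string instead of relying on "use open".
my $fh = IO::File->new();
$fh->open(\$bytes, '<:encoding(UTF-8)') or die "open failed: $!";
my $line = $fh->getline();
$fh->close();

binmode(STDOUT, ':encoding(UTF-8)');
print $line eq $str ? "round trip ok\n" : "round trip failed\n";
```

Because the layer travels with the call, the result should not change with the lexical scope the call is made from.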
Answer 1 (score: 3)
[This isn't an answer, but a notice of a bug that doesn't fit in a comment.]
A file can only contain bytes. $str contains values that are not bytes. Therefore,
open(my $fh, '<', \$str)
makes no sense. It should be
open(my $fh, '<', \encode_utf8($str))
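To make the distinction between characters and bytes concrete, here is a small sketch using the string from the question. The variable names are illustrative only, and the decoding layer is given explicitly so the example does not depend on use open.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode_utf8);

my $str   = "Hello ȓ (r-caret)";
my $bytes = encode_utf8($str);    # what a file holding this text would contain

my $char_count = length($str);      # counts code points
my $byte_count = length($bytes);    # counts UTF-8 bytes; U+0213 takes two

# Read the bytes back through a decoding layer to recover the characters.
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "open failed: $!";
my $decoded = <$fh>;
close($fh);

printf "%d characters, %d bytes\n", $char_count, $byte_count;
print "decoded text matches the original\n" if $decoded eq $str;
```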
use utf8;
use v5.16;
use strict;
use warnings;
use warnings qw(FATAL utf8);
use open qw( :std :utf8 );
use Encode qw( encode_utf8 );
use Data::Dumper qw( Dumper );
sub dump_str {
    local $Data::Dumper::Useqq = 1;
    local $Data::Dumper::Terse = 1;
    local $Data::Dumper::Indent = 0;
    return Dumper($_[0]);
}
for my $encode (0..1) {
    for my $orig ("\x{213}", "\x{C9}", substr("\x{C9}\x{213}", 0, 1)) {
        my $file_ref = $encode ? \encode_utf8($orig) : \$orig;
        my $got = eval { open(my $fh, '<', $file_ref); <$fh> };
        printf("%-10s %-6s %-9s => %-10s => %s\n",
            $encode ? "bytes" : "codepoints",
            defined($got) && $orig eq $got ? "ok" : "not ok",
            dump_str($orig),
            dump_str($$file_ref),
            defined($got) ? dump_str($got) : 'DIED',
        );
    }
}
Output:
codepoints ok     "\x{213}" => "\x{213}"  => "\x{213}"
codepoints not ok "\311"    => "\311"     => DIED
codepoints not ok "\x{c9}"  => "\x{c9}"   => DIED
bytes      ok     "\x{213}" => "\310\223" => "\x{213}"
bytes      ok     "\311"    => "\303\211" => "\x{c9}"
bytes      ok     "\x{c9}"  => "\303\211" => "\x{c9}"