Question

I am trying to pass a Unicode character as an argument to a Perl script:

C:\>perl test.pl ö

#----
# test.pl
#----
#!/usr/bin/perl
use warnings;
use strict;

my ($name, $number) = @ARGV;

if (not defined $name) {
    die "Need name\n";
}

if (defined $number) {
    print "Save '$name' and '$number'\n";
    # save name/number in database
    exit;
}

if ($name eq 'ö') {
    print "Fetch umlaut 'oe'\n";
} elsif ($name eq 'o') {
    print "Fetch simple 'o'\n";
} else {
    print "Fetch other '$name'\n";
}

print "ü";

I get the output:

Fetch simple 'o'
ü

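I have tested the same code (algorithm) in Python 3 and it works there: I get "ö". Apparently there is something more that I have to add or set in Perl. It makes no difference whether I use Strawberry Perl or ActiveState Perl; the result is the same.

Thanks in advance!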

Answer 1

#!/usr/bin/perl

use strict;
use warnings;

my $encoding_in;
my $encoding_out;
my $encoding_sys;
BEGIN {
    require Win32;

    $encoding_in  = 'cp' . Win32::GetConsoleCP();
    $encoding_out = 'cp' . Win32::GetConsoleOutputCP();
    $encoding_sys = 'cp' . Win32::GetACP();

    binmode(STDIN,  ":encoding($encoding_in)");
    binmode(STDOUT, ":encoding($encoding_out)");
    binmode(STDERR, ":encoding($encoding_out)");
}

use Encode qw( decode );

{
    my ($name, $number) = map { decode($encoding_sys, $_) } @ARGV;

    if (not defined $name) {
        die "Need name\n";
    }

    if (defined $number) {
        print "Save '$name' and '$number'\n";
        # save name/number in database
        exit;
    }

    if ($name eq 'ö') {
        print "Fetch umlaut 'oe'\n";
    } elsif ($name eq 'o') {
        print "Fetch simple 'o'\n";
    } else {
        print "Fetch other '$name'\n";
    }

    print "ü";
}

Additionally, you should add use feature qw( unicode_strings ); and/or encode the file using UTF-8 and add use utf8;.
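To illustrate why both the decoding of @ARGV and use utf8 matter for the comparison $name eq 'ö', here is a minimal sketch. It assumes the system code page is cp1252 (where "ö" is the single byte 0xF6) and that the script file is saved as UTF-8; the variable names are made up for the illustration, and the byte strings stand in for what Perl would see in each case.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Assumption: the system code page is cp1252, where "ö" arrives as byte 0xF6.
my $arg_bytes = "\xF6";
my $name      = decode('cp1252', $arg_bytes);    # now the character U+00F6

# A UTF-8 encoded source file stores the literal 'ö' as the two bytes C3 B6.
my $literal_without_utf8 = "\xC3\xB6";           # what Perl sees without "use utf8"
my $literal_with_utf8    = decode('UTF-8', $literal_without_utf8);  # what "use utf8" yields

print "without use utf8: ", ($name eq $literal_without_utf8 ? "match" : "no match"), "\n";
print "with use utf8:    ", ($name eq $literal_with_utf8    ? "match" : "no match"), "\n";
```

Under these assumptions the first comparison fails and the second succeeds, which is why the decoded argument only compares equal to the literal once the source is read as UTF-8 as well.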

Answer 2

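In addition to ikegami's fine answer, I am a fan of the Encode::Locale module, which automatically creates aliases for the current console's code pages. It works on Win32, OS X and other flavors of *nix.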

#!/usr/bin/perl

use strict;
use warnings;

# These two lines make life better when you leave the world of ASCII
# Just remember to *save* the file as UTF8....
use utf8;
use feature 'unicode_strings';

use Encode::Locale 'decode_argv';         # We'll use the console_in & console_out aliases as well as decode_argv().
use Encode;

binmode(STDIN,  ":encoding(console_in)");
binmode(STDOUT, ":encoding(console_out)");
binmode(STDERR, ":encoding(console_out)");

decode_argv( );   # Decode ARGV in place
my ($name, $number) = @ARGV;

if (not defined $name) {
    die "Need name\n";
}

if (defined $number) {
    print "Save '$name' and '$number'\n";
    # save name/number in database
    exit;
}

if ($name eq 'ö') {
    print "Fetch umlaut 'oe'\n";
} elsif ($name eq 'o') {
    print "Fetch simple 'o'\n";
} else {
    print "Fetch other '$name'\n";
}

print "ü";

Maybe it is just syntactic sugar, but it is easy to read and promotes cross-platform compatibility.
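To see which real encodings those aliases stand for on a given machine, a small sketch like the following may help. It assumes Encode::Locale is installed from CPAN (it is not a core module) and degrades to a message if it is not; the output depends entirely on the local console and locale settings.

```perl
use strict;
use warnings;
use Encode qw( find_encoding );

# Load Encode::Locale at run time so the script still runs without it.
if (eval { require Encode::Locale; 1 }) {
    # These four alias names are the ones Encode::Locale registers.
    for my $alias (qw( locale locale_fs console_in console_out )) {
        my $enc = find_encoding($alias);
        printf "%-12s => %s\n", $alias, $enc ? $enc->name : '(unknown)';
    }
} else {
    print "Encode::Locale is not installed\n";
}
```

On a Windows console the console_in and console_out lines would be expected to show cp-style names, matching what Win32::GetConsoleCP() and Win32::GetConsoleOutputCP() report in the first answer.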

Answer 3

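I think the code answers to this question are clear, but not complete:

Done that way, it is very complex to construct a script that takes all code pages and the source code encoding into account, and it is even harder to make it portable: ö is familiar to users of the Latin alphabet, but の or 렌 exist too...
These scripts may run fine with characters from one particular code page, but they will fail with characters outside it (which is probably the case for some users in the comments). Note that Windows code pages predate Unicode.
The fundamental problem is that Perl 5 for Windows is not compiled with Unicode support as Windows understands it: it is just a port of the Linux code, so almost all Unicode characters are mangled before they even reach the Perl code.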

A longer technical explanation (and a C patch!) is provided on A. Sinan Unur's page Fixing Perl's Unicode problems on the command line on Windows: A trilogy in N parts (under the Artistic License 2.0).

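So (but not for the faint of heart): recompiling perl.exe is possible, and the result is almost completely Unicode compliant on Windows. Hopefully these patches will be integrated into the source code some day... Until then, I have summarized some detailed instructions to patch perl.exe here.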

Note also that a proper command console with full Unicode support is needed. A quick solution is to use ConEmu, but Windows' cmd.exe could also work after some heavy tweaks.
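One way to check whether a character was already mangled before it reached the Perl code is to dump the raw bytes of each argument. The following is only a diagnostic sketch: the helper name hex_dump, the script name dump_argv.pl and the fallback argument are hypothetical, and the output is plain ASCII so it displays correctly on any console.

```perl
use strict;
use warnings;

# Render a string as space-separated hex values, one per byte/character.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $string;
}

# Hypothetical fallback so the script prints something when run without arguments.
my @args = @ARGV ? @ARGV : ('o');

for my $i (0 .. $#args) {
    printf "arg %d: %s\n", $i, hex_dump($args[$i]);
}
```

If perl dump_argv.pl ö prints 6F, the character was replaced by a plain "o" before Perl saw it, which would match the output in the question. F6 would suggest a single-byte code page such as cp1252, and C3 B6 would suggest UTF-8.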

Answer 4

I do not know whether this is a solution only for special cases, but I was able to get by with the "-CAS" switch when calling the script.

Example:

Script_1:

use strict;
use utf8;

$|++; # Prevent buffering issues


my ($arg) = @ARGV;
save_to_file('test.txt', $arg);

sub save_to_file{   

    my ($filename, $content) = @_;

    open(my $fh, '>:encoding(UTF-8)', $filename) or die "Can't open > $filename: $!";
    print $fh $content;
    close $fh;

    return;
}

Script_2, which calls Script_1:

use strict;
use utf8;


execute_command();

sub execute_command {


    my $command = "perl -CAS simple_utf_string.pl äääöööü";

    # Execute command
    print "The command to run is: $command\n";
    open my $command_pipe, "-|:encoding(UTF-8)", $command or die "Pipe from $command failed: $!";
    while (<$command_pipe>) {
        print  $_;
    }
}

Result in test.txt:

äääöööü
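For context, perlrun documents the letters of the -C switch: A means the elements of @ARGV are expected to be UTF-8 encoded, and S covers STDIN, STDOUT and STDERR. A rough in-script approximation is sketched below; decode_args is a hypothetical helper name, and the sketch only helps when the arguments really do arrive as UTF-8 bytes, which is not guaranteed on a Windows console.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Roughly what the "A" in -CAS does: treat every argument as UTF-8 bytes.
sub decode_args {
    return map { my $bytes = $_; decode('UTF-8', $bytes) } @_;
}

# Roughly what the "S" in -CAS does: UTF-8 on the three standard handles.
binmode($_, ':encoding(UTF-8)') for \*STDIN, \*STDOUT, \*STDERR;

for my $arg (decode_args(@ARGV)) {
    printf "%s: %d character(s)\n", $arg, length $arg;
}
```

Doing this inside the script has the advantage that callers do not have to remember the switch, at the cost of hard-coding UTF-8 rather than asking the system for its code page as the earlier answers do.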

Perl Unicode support for arguments from the console (@ARGV) on Windows
