I found this sample script in "How can I guess the encoding of a string in Perl?":
#!C:\perl\bin
use utf8;
use Encode qw(encode PERLQQ XMLCREF);
my $string = 'This year I went to 北京 Perl workshop.';
#print encode('ascii', $string, PERLQQ);
# This year I went to \x{5317}\x{4eac} Perl workshop.
print encode('ascii', $string, XMLCREF); # This year I went to &#x5317;&#x4eac; Perl workshop.
When I tested it, I got the following encoded output:
This year I went to \x{71fa9} Perl workshop.
This year I went to &#x71fa9; Perl workshop.
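My result appears to differ from the one shown by the author of the sample code above.
I would like to know how to encode a string and output it in numeric character reference format (&#xHHHH;), for example: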
my $string = 'This year I went to 北京 Perl workshop.';
Encoded output:
This year I went to &#x5317;&#x4eac; Perl workshop.
Answer 0 (score: 1)
Answer 1 (score: 0)
$string =~ s/[^\0-\377]/ sprintf '&#x%04x;', ord($&) /ge
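finds every character in $string that is outside the range 0-255 (that is, any wide character) and replaces it with the value of the expression sprintf '&#x%04x;', ord($&), where $& is the wide character that was matched. The /e modifier evaluates the replacement as Perl code, and /g applies it to every match.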
use utf8;
$string = "This year I went to \x{5317}\x{4eac} Perl workshop.";
$string =~ s/[^\0-\377]/ sprintf '&#x%04x;', ord($&) /ge;
print $string;
Output:
This year I went to &#x5317;&#x4eac; Perl workshop.
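
The same substitution can be wrapped in a small subroutine. This is only a sketch: the name to_ncr is hypothetical and not part of the original answer, and it uses a capture group ($1) in place of $&. The string is written with \x{...} escapes, so the result does not depend on how the source file itself is encoded.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): replace every
# character above code point 255 with a hexadecimal numeric character
# reference of the form &#xHHHH;.
sub to_ncr {
    my ($text) = @_;
    # Capture the wide character in $1 instead of relying on $&.
    # /e evaluates the replacement as code, /g applies it to every match.
    $text =~ s/([^\x00-\xFF])/sprintf('&#x%04x;', ord($1))/ge;
    return $text;
}

my $string = "This year I went to \x{5317}\x{4eac} Perl workshop.";
print to_ncr($string), "\n";
```

Characters in the range 0-255, including Latin-1 letters such as é, are left untouched, which matches the behaviour of the one-line version above.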