Escaping multibyte characters

Date: 2018-11-08 15:20:58

Tags: perl unicode utf-8 escaping unicode-escapes

Using Python, I can take a string containing multibyte characters and get it back UTF-8 escaped:

$ python3 -c 'print("hello ☺ world".encode("utf-8"))'
b'hello \xe2\x98\xba world'

Or Unicode escaped:

$ python3 -c 'print("hello ☺ world".encode("unicode-escape"))'
b'hello \\u263a world'

Can Perl do something like this? I tried "quotemeta", but it does not seem to be the right tool:

$ perl -e 'print quotemeta("hello ☺ world\n");'
hello\ \�\�\�\ world\

1 answer:

Answer 0 (score: 13)

Data::Dumper can do this.

use utf8;
use Encode;
use Data::Dumper;
$Data::Dumper::Terse = 1;   # suppress the '$VAR1 = ...' header
$Data::Dumper::Useqq = 1;   # use double quotes and escape non-printable characters

print Dumper("hello ☺ world");
print Dumper(encode("UTF-8","hello ☺ world"));

Output:

"hello \x{263a} world"
"hello \342\230\272 world"

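To relate the two output lines to the Python results in the question: the first is a code point escape (U+263A), and the second is the same three UTF-8 bytes (e2 98 ba) written in octal. The sketch below reproduces both forms in Python. The helper names perl_style_chars and perl_style_octal are hypothetical, and this is not how Data::Dumper is implemented: it only escapes non-ASCII input and ignores control characters, quotes, and backslashes.

```python
text = "hello \u263a world"  # U+263A is the smiley used in the examples


def perl_style_chars(s):
    """Escape non-ASCII characters as \\x{hex}, like the first output line."""
    return "".join(c if ord(c) < 128 else "\\x{%x}" % ord(c) for c in s)


def perl_style_octal(b):
    """Escape non-ASCII bytes as three-digit octal, like the second output line."""
    return "".join(chr(x) if x < 128 else "\\%03o" % x for x in b)


print(perl_style_chars(text))
print(perl_style_octal(text.encode("utf-8")))
```
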
Update: the relevant function in the Data::Dumper module is qquote, so you can skip setting $Useqq and $Terse:

use utf8;
use Encode;
use Data::Dumper;

print Data::Dumper::qquote("hello ☺ world"), "\n";
print Data::Dumper::qquote(encode("UTF-8","hello ☺ world")), "\n";