How do I replace � with & in Perl?

Time: 2016-04-12 15:00:01

Tags: regex perl

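I have an external Windows application that calls a Perl script with a string that contains �. I am trying to detect every such instance and replace it with &.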

I tried all of these, but none of them works:

$line =~ s/\uFFFD/&/g;
$line =~ s/&#65533/&/g;
$line =~ s/\x{fffd}/&/g;
$line =~ s/\xfffd/&/g;

1 answer:

Answer 0 (score: 0)

As ikegami pointed out in their comment, the third solution ($line =~ s/\x{fffd}/&/g;) is the correct one. If it does not work, one of your assumptions has to be wrong. Two possibilities come to my mind:

  • Your input does not contain an actual Unicode replacement character (U+FFFD). Your editor may simply render some other byte sequence the same way. You can check this by running hexdump -c on the input. If the input is encoded in UTF-8 (mind the difference between UTF-8 and Unicode), you should see the octal sequence 357 277 275, which is the bytes EF BF BD in hexadecimal.
  • You did not inform perl about the input text encoding. Perl assumes it is a one-byte encoding and thus a regex containing a multibyte character will never match. Please compare the following:

    echo '�' | perl -pe 's/\x{fffd}/&/'
    �

    echo '�' | perl -CS -pe 's/\x{fffd}/&/'
    &
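If you cannot pass -CS on the command line (for example because the external application controls how the script is started), the same two checks can be done inside the script. The following is only a minimal sketch, assuming the string arrives as raw UTF-8 bytes. The helper names replace_fffd and hex_bytes are made up for illustration; Encode is a core Perl module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes raw UTF-8 bytes and returns raw UTF-8 bytes
# with every U+FFFD replaced by a literal "&".
# Assumption: the input really is UTF-8. With the default settings,
# decode() also turns malformed bytes into U+FFFD, so those become "&" too.
sub replace_fffd {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);   # bytes -> Perl character string
    $text =~ s/\x{FFFD}/&/g;              # the code point can match now
    return encode('UTF-8', $text);        # characters -> bytes for output
}

# Hypothetical helper: hex dump of the raw bytes, to see what the
# input actually contains (same idea as running hexdump on it).
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

my $line = "A \xEF\xBF\xBD B";            # U+FFFD written as UTF-8 bytes
print hex_bytes($line), "\n";             # 41 20 ef bf bd 20 42
print replace_fffd($line), "\n";          # A & B
```

If hex_bytes shows something other than ef bf bd where the � appears, the first possibility applies: the input does not contain U+FFFD, and you need to match whatever bytes are really there.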