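I have an external Windows application that calls a Perl script and passes it a string containing �. I am trying to detect every such instance and replace it with &.
I tried all of these, but nothing works: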
$line =~ s/\uFFFD/&/g;
$line =~ s/�/&/g;
$line =~ s/\x{fffd}/&/g;
$line =~ s/\xfffd/&/g;
Answer 0 (score: 0)
As ikegami pointed out in their comment, the third solution ($line =~ s/\x{fffd}/&/g;) is the correct one. If it does not work, one of your assumptions must be wrong. Two possibilities come to mind:
1. The input does not contain the character you expect. Inspect the raw bytes with hexdump -c. If your input is encoded in UTF-8 (mind the difference between UTF-8 and Unicode), you should see the octal sequence 357 277 275 (hex EF BF BD).
2. You did not tell perl how the input text is encoded. Perl then treats the input as a sequence of single bytes, so a regex for the single character U+FFFD never matches the three bytes that encode it. Compare the following:
echo '�' | perl -pe 's/\x{fffd}/&/'
�
echo '�' | perl -CS -pe 's/\x{fffd}/&/'
&
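If changing the command line to add -CS is not an option (for example, because the external application controls how the script is invoked), the same decoding can be done inside the script. The following is only a minimal sketch, assuming the string arrives as raw UTF-8 bytes; the variable names $bytes, $line and $fixed and the sample text are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: the raw UTF-8 bytes of U+FFFD (EF BF BD) inside a string,
# as they would look when read from an undecoded input stream.
my $bytes = "before \xEF\xBF\xBD after";

# Decode the bytes into Perl characters so that \x{fffd} matches one code point.
my $line = decode('UTF-8', $bytes);

# Replace every U+FFFD with a literal ampersand.
(my $fixed = $line) =~ s/\x{fffd}/&/g;

# Encode back to bytes before printing to a byte-oriented handle.
print encode('UTF-8', $fixed), "\n";
```

For real input, binmode(STDIN, ':encoding(UTF-8)') and binmode(STDOUT, ':encoding(UTF-8)') would have much the same effect as -CS without the explicit decode and encode calls.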