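I am trying to save this web page, which appears to be cp1256-encoded, as a UTF-8 encoded text file. The problem occurs if I try to replace the HTML entity for the Arabic comma with the actual character "،" before saving: the contents of the saved file are no longer Arabic.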
#!C:\perl\bin\perl.exe
use Encode;
use LWP::Simple;
binmode STDOUT, ':encoding(UTF-8)';
my $url = qq{https://www.altafsir.com/Tafasir.asp?tMadhNo=1&tTafsirNo=7&tSoraNo=1&tAyahNo=1&tDisplay=yes&UserProfile=0&LanguageId=1};
my $content = get($url);
$content = decode('cp1256', $content);
my $ch = chr(0x60c);
# Replace the HTML entity for the Arabic comma (U+060C); this line causes the problem
$content =~ s/&#1548;/$ch/g;
# Write the result as UTF-8 (three-argument open, lexical filehandle)
open my $fh, '>:encoding(UTF-8)', 'filecontent.txt' or die "Error creating file: $!\n";
print {$fh} $content;
close $fh;
exit;
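
To check the decode, replace and encode steps separately from the download, a minimal offline sketch like the one below may help. It is not the page's real content: the sample bytes are an assumption (two cp1256 letters around the decimal entity `&#1548;`), and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample input: in cp1256, 0xC7 is ALEF (U+0627) and 0xC8 is BEH (U+0628).
my $bytes = "\xC7\xC8&#1548;\xC7";

# cp1256 bytes -> Perl character string.
my $text = decode('cp1256', $bytes);

# Replace the numeric entity with the Arabic comma character itself.
my $comma = chr(0x60C);
(my $fixed = $text) =~ s/&#1548;/$comma/g;

# Character string -> UTF-8 bytes, which is what an :encoding(UTF-8) layer writes.
my $utf8 = encode('UTF-8', $fixed);

# Show the resulting code points so the result does not depend on the terminal.
print join(' ', map { sprintf 'U+%04X', ord } split //, $fixed), "\n";
# prints: U+0627 U+0628 U+060C U+0627
```

If this prints the expected code points while the full script still writes non-Arabic text, the difference may lie in what `get` returns for that page rather than in the substitution itself.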