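The character in question is ’ (the right single quotation mark, U+2019). I can't find a way to detect it, replace it, or write it correctly to an XML file. At first I built the XML with string concatenation, then I found XML::Writer, but that doesn't work either: the resulting XML is still broken. (The output needs to be UTF-8.)
Here's a test I wrote that still breaks: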
my $output = new IO::File(">$foundFilePath");
my $writer = new XML::Writer(OUTPUT => $output);
$writer->xmlDecl("UTF-8");
$writer->startTag("xml");
$writer->startTag("test");
$writer->characters("’");
$writer->endTag("test");
$writer->endTag("xml");
$writer->end();
$output->close();
More specifically, I'm trying to scrape data from this page: http://investing.businessweek.com/businessweek/research/stocks/private/snapshot.asp?privcapId=4439466
Mr. William O’Keefe's name is what messes everything up.
Answer 0 (score: 3)
You need to do two things. First, if you want to write UTF-8 to a file, you have to say so when you open it:
my $output = IO::File->new($foundFilePath, ">:utf8");
Second, if you want to use literal UTF-8 strings in your source code, you need to say
use utf8;
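at the beginning of your program. Otherwise, Perl assumes your source code is Latin-1 and reads the literal as a sequence of individual bytes rather than as one character.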
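To see why both steps matter, here is a minimal sketch (not part of the original answer) using the core Encode module. It assumes the character is U+2019 and builds it from byte escapes, so the result does not depend on how the script file itself is saved. The variable names are purely illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of U+2019, which is how Perl sees the literal
# when "use utf8" is absent.
my $bytes = "\xE2\x80\x99";

# Decoding yields the single character that Perl sees with "use utf8".
my $char = decode('UTF-8', $bytes);

# Encoding the undecoded bytes treats each one as a Latin-1 character,
# which is the double encoding a ":utf8" layer applies to such a string.
my $double = encode('UTF-8', $bytes);

printf "bytes: %d, chars: %d, double-encoded bytes: %d\n",
    length($bytes), length($char), length($double);
# prints "bytes: 3, chars: 1, double-encoded bytes: 6"
```

Only the decoded, one-character string comes out as the correct three bytes when it is written through a UTF-8 output layer.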
Here is a complete example script:
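use utf8;      # the source file contains literal UTF-8 characters
use strict;
use warnings;
use IO::File;
use XML::Writer;
my $foundFilePath = 'test.xml';
# Open the output file with a UTF-8 layer so characters are encoded on write.
my $output = IO::File->new($foundFilePath, ">:utf8");
my $writer = XML::Writer->new(OUTPUT => $output);
# Declare the encoding in the XML prolog.
$writer->xmlDecl("UTF-8");
$writer->startTag("xml");
$writer->startTag("test");
$writer->characters("’");
$writer->endTag("test");
$writer->endTag("xml");
# Finish the document, then close the file handle.
$writer->end();
$output->close();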