I have an XML file whose header looks like this:
<!ENTITY nbsp " "><!-- no-break space = non-breaking space,
U+00A0 ISOnum -->
<!ENTITY iexcl "¡"><!-- inverted exclamation mark, U+00A1 ISOnum -->
<!ENTITY cent "¢"><!-- cent sign, U+00A2 ISOnum -->
<!ENTITY pound "£"><!-- pound sign, U+00A3 ISOnum -->
<!ENTITY curren "¤"><!-- currency sign, U+00A4 ISOnum -->
<!ENTITY yen "¥"><!-- yen sign = yuan sign, U+00A5 ISOnum -->
<!ENTITY brvbar "¦"><!-- broken bar = broken vertical bar,
U+00A6 ISOnum -->
<!ENTITY sect "§"><!-- section sign, U+00A7 ISOnum -->
<!ENTITY uml "¨"><!-- diaeresis = spacing diaeresis,
U+00A8 ISOdia -->
<!ENTITY copy "©"><!-- copyright sign, U+00A9 ISOnum -->
<!ENTITY ordf "ª"><!-- feminine ordinal indicator, U+00AA ISOnum -->
<!ENTITY laquo "«"><!-- left-pointing double angle quotation mark
= left pointing guillemet, U+00AB ISOnum -->
<!ENTITY not "¬"><!-- not sign, U+00AC ISOnum -->
<!ENTITY shy "­"><!-- soft hyphen = discretionary hyphen,
U+00AD ISOnum -->
<!ENTITY reg "®"><!-- registered sign = registered trade mark sign,
U+00AE ISOnum -->
<!ENTITY macr "¯"><!-- macron = spacing macron = overline
= APL overbar, U+00AF ISOdia -->
<!ENTITY deg "°"><!-- degree sign, U+00B0 ISOnum -->
<!ENTITY plusmn "±"><!-- plus-minus sign = plus-or-minus sign,
U+00B1 ISOnum -->
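When I try to load it into a DOMDocument, nothing seems to get saved to a file. I think the declarations above may be causing a parse error. Is there a way to strip this header?
Here is my PHP code: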
$xml = curl_exec($ch);
$srcDom = new DOMDocument;
$srcDom->load($xml);
$xPath = new DOMXPath($srcDom);
foreach ($srcDom->getElementsByTagName('Venue') as $venue) {
    $dstDom = new DOMDocument('1.0', 'utf-8');
    $dstDom->appendChild($dstDom->createElement('EventsPricePoints'));
    $dstDom->documentElement->appendChild($dstDom->importNode($venue, true));
    $allEventsForVenue = $xPath->query(
        sprintf(
            '/Store/EventsPricePoints/Event[VenueID/@ID=%d]',
            $venue->getAttribute('ID')
        )
    );
    foreach ($allEventsForVenue as $event) {
        $dstDom->documentElement->appendChild($dstDom->importNode($event, true));
    }
    $dstDom->formatOutput = true;
    $dstDom->saveXml(sprintf('/var/www/html/venuexml/%d.xml', $venue->getAttribute('ID')));
}
Answer 0 (score: 0)
You might be interested in strip_tags, but you would need to whitelist every legitimate tag.
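Answer 1 (score: 0)
Your code most likely does not cause a parse error. If you had error logging or error reporting enabled you would probably have seen warnings, but I assume that is not the case.
Instead, the document loads. XML is UTF-8 encoded by default, so none of the entities you list need to be transported as entities: the XML can contain the characters themselves.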
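So both the entity definitions in the XML and the entity references themselves are superfluous. I would guess that DOMDocument simply drops them.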
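To illustrate the point about entities, here is a minimal sketch rather than anything taken from the question. The sample document, its Store and Price elements, and the choice of the LIBXML_NOENT flag are assumptions made for the example. It parses one price written with a declared entity and one written with the literal character, and both should come out as the same text. The source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample input: an internal DTD subset declaring one entity,
// followed by one reference to it and one literal use of the same character.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Store [
<!ENTITY pound "£">
]>
<Store><Price>&pound;10</Price><Price>£12</Price></Store>
XML;

$dom = new DOMDocument();
// LIBXML_NOENT asks libxml to substitute entity references with their text.
$dom->loadXML($xml, LIBXML_NOENT);

// Collect the text of every Price element.
$prices = [];
foreach ($dom->getElementsByTagName('Price') as $price) {
    $prices[] = $price->textContent;
}
echo implode("\n", $prices), "\n";
```

Since the literal character parses just as well, the declarations are only needed when the document actually contains references such as &pound;.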
Also, if you had provided a sample chunk of the XML for testing, you could have received a more specific answer to your question.
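On the point about warnings, a sketch of how the load and save steps could be checked, under assumptions the question does not confirm: that $xml holds the response body as a string (curl_exec() returns one only when CURLOPT_RETURNTRANSFER is set), and that the goal is to write a file. DOMDocument::load() expects a filename or URL while loadXML() parses a string, and save() writes a file while saveXML() returns the markup as a string. The sample input and the output path are hypothetical.

```php
<?php
// Assumption: $xml is the response body as a string. A literal stands in here.
$xml = '<Store><Venue ID="7"/><EventsPricePoints><Event><VenueID ID="7"/></Event></EventsPricePoints></Store>';

// Collect libxml errors so they can be inspected instead of being lost.
libxml_use_internal_errors(true);

$srcDom = new DOMDocument();
if (!$srcDom->loadXML($xml)) {          // loadXML() takes the XML text itself
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    throw new RuntimeException('XML did not parse: ' . implode('; ', $messages));
}

// Build a small output document containing every Venue element.
$dstDom = new DOMDocument('1.0', 'utf-8');
$dstDom->formatOutput = true;
$root = $dstDom->appendChild($dstDom->createElement('EventsPricePoints'));
foreach ($srcDom->getElementsByTagName('Venue') as $venue) {
    $root->appendChild($dstDom->importNode($venue, true));
}

// save() returns the number of bytes written, or false on failure.
$path = sys_get_temp_dir() . '/venue-example.xml';   // hypothetical output path
$bytes = $dstDom->save($path);
echo $bytes === false ? "write failed\n" : "wrote $bytes bytes to $path\n";
```

Checking the return values of loadXML() and save() makes a failed parse or an unwritable directory visible right away.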