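PHP DOMDocument not formatting output correctly

Asked: 2011-09-02 13:33:28

Tags: php xml simplexml domdocument

I am currently working on a sitemap for a website. I use SimpleXML to import the original XML file with simplexml_load_file("small.xml"); and run some checks on it. After that, I convert it to a DOMDocument with dom_import_simplexml() so that adding and manipulating XML elements precisely is easier. Here is the test XML sitemap I am using: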

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:52:32-Orouke.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:23-castle technology.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:38-banana split.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:42-Waveney.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:55:12-pure orange.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:57:54-tau press.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:21-E.f.m.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:31-apple.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:45-townhouse communications.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
</urlset>

And here is the test code I am using to modify it:

<?php

// Load the sitemap with SimpleXML, then switch to the DOM API.
$root = simplexml_load_file("small.xml");
$domRoot = dom_import_simplexml($root);
$dom = $domRoot->ownerDocument;

// Build a new <url> entry with <loc> and <lastmod> children.
$urlElement = $dom->createElement("url");

$locElement = $dom->createElement("loc");
$locElement->appendChild($dom->createTextNode("www.google.co.uk"));
$urlElement->appendChild($locElement);

$lastmodElement = $dom->createElement("lastmod");
$lastmodElement->appendChild($dom->createTextNode("2011-08-02"));
$urlElement->appendChild($lastmodElement);

// Append the new entry to <urlset> and print the document.
$domRoot->appendChild($urlElement);

$dom->formatOutput = true;
echo $dom->saveXML();

?>

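The main problem is that no matter where I put $dom->formatOutput = true;, the existing XML imported from SimpleXML is formatted correctly, but everything new is written in an "all on one line" style, as shown below: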

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:52:32-Orouke.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:23-castle technology.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:38-banana split.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:42-Waveney.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:55:12-pure orange.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:57:54-tau press.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:21-E.f.m.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:31-apple.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:45-townhouse communications.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
<url><loc>www.google.co.uk</loc><lastmod>2011-08-02</lastmod></url></urlset>

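If anyone knows why this happens and how to fix it, I would be very grateful.

2 answers:

Answer 0: (score: 1)

There is a workaround. You can force reformatting by first saving the new XML as a string and then loading it again after setting the formatOutput property, for example: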

$strXml = $dom->saveXML();
$dom->formatOutput = true;
$dom->loadXML($strXml);
echo $dom->saveXML();
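
As far as I understand libxml's behaviour, indentation is skipped inside any element that already contains text nodes, including whitespace-only ones, so the round trip above probably only takes effect when the reloaded document has preserveWhiteSpace set to false. The following is a minimal sketch under that assumption. The helper name reformatXml and the tiny stand-in sitemap are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: re-parse the serialized XML with whitespace-only
// text nodes dropped, so that formatOutput can re-indent the whole tree.
function reformatXml(DOMDocument $dom): string
{
    $clean = new DOMDocument('1.0', 'UTF-8');
    $clean->preserveWhiteSpace = false; // must be set before loadXML()
    $clean->formatOutput = true;
    $clean->loadXML($dom->saveXML());
    return $clean->saveXML();
}

// Usage with a small stand-in for the sitemap in the question.
$dom = new DOMDocument();
$dom->loadXML("<urlset>\n  <url>\n    <loc>a</loc>\n  </url>\n</urlset>");

// Append a new entry the same way the question does.
$url = $dom->createElement('url');
$url->appendChild($dom->createElement('loc', 'b'));
$dom->documentElement->appendChild($url);

echo reformatXml($dom);
```

The sketch parses into a fresh DOMDocument instead of reusing $dom, so the original document object is left untouched.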

Answer 1: (score: 0)

To get nicely formatted output, you need to set the preserveWhiteSpace property to false before loading, as described in the documentation.

Example:

$Xhtml = "<div><span></span></div>";
$doc = new DOMDocument('1.0','UTF-8');
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$doc->loadXML($Xhtml);
$formattedXhtml = $doc->saveXML($doc->documentElement, LIBXML_NOXMLDECL);
$expectedFormatting =<<<EOF
<div>
  <span/>
</div>
EOF;
$this->assertEquals($expectedFormatting,$formattedXhtml,"The XHTML is formatted");   

Posting this for visitors who land here, since this is the first result on Google search.
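
Applied to the sitemap from the question, the same idea might look like the sketch below. It assumes the file can be parsed directly with DOMDocument rather than imported from SimpleXML. The inline $xml string is a shortened stand-in for small.xml, and the variable names are illustrative only.

```php
<?php
// Shortened stand-in for small.xml; a real file could be read with $dom->load().
$xml = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
</urlset>
XML;

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // drop indentation text nodes while parsing
$dom->formatOutput = true;
$dom->loadXML($xml);

// Build the new <url> entry as in the question.
$url = $dom->createElement('url');
$url->appendChild($dom->createElement('loc', 'www.google.co.uk'));
$url->appendChild($dom->createElement('lastmod', '2011-08-02'));
$dom->documentElement->appendChild($url);

$result = $dom->saveXML();
echo $result;
```

Because the indentation whitespace is discarded at parse time, the serializer is free to indent both the existing and the newly appended entries consistently.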