在R中解析缩小的XML字符串

时间:2015-12-14 13:39:10

标签: xml r analytics

我想使用R,

以树格式打印缩小的xml字符串

我写了以下代码:

> ex  <- "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"
> 
> XML::xmlParse(ex)
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

它工作正常,但如果我尝试这个例子的其他XML字符串,如:

ex2 <- "<?xml version="1.0" encoding="UTF-8" ?><!DOCTYPE foo SYSTEM "Foo.dtd"><a><b>bbb</b><c/><d><soapenv:Envelope xmlns:soapenv="http://xxx" xmlns:xsd="http://yyy" xmlns:xsi="http://zzz"></soapenv></d><e><![CDATA[ <z></z> ]]></e><f><g></g></f></a>"
**Error: unexpected numeric constant in "ex2 <- "<?xml version="1.0"**

请建议我如何才能完成这项工作。

我需要如下结果:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo SYSTEM "Foo.dtd">
<a>
    <b>bbb</b>
    <c/>
    <d>
        <soapenv:Envelope 
            xmlns:soapenv="http://xxx" 
            xmlns:xsd="http://yyy" 
            xmlns:xsi="http://zzz">
        </soapenv>
    </d>
    <e>
        <![CDATA[ <z></z> ]]>
    </e>
    <f>
        <g></g>
    </f>
</a>

请建议可以应用的属性忽略标签不匹配..!

示例XML:

<?xml version="1.0"?><catalog><book id="bk101"><author>Gambardella, Matthew</author><title>XML Developer's Guide</title><genre>Computer</genre><price>44.95</price><publish_date>2000-10-01</publish_date><description>An in-depth look at creating applications 
      with XML.</description></book><book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-12-16</publish_date><description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description></book><book id="bk103"><author>Corets, Eva</author><title>Maeve Ascendant</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-11-17</publish_date><description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description></book><book id="bk104"><author>Corets, Eva</author><title>Oberon's Legacy</title><genre>Fantasy</genre><price>5.95</price><publish_date>2001-03-10</publish_date><description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description></book><book id="bk105"><author>Corets, Eva</author><title>The Sundered Grail</title><genre>Fantasy</genre><price>5.95</price><publish_date>2001-09-10</publish_date><description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description></book><book id="bk106"><author>Randall, Cynthia</author><title>Lover Birds</title><genre>Romance</genre><price>4.95</price><publish_date>2000-09-02</publish_date><description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description></book><book id="bk107"><author>Thurman, Paula</author><title>Splish Splash</title><genre>Romance</genre><price>4.95</price><publish_date>2000-11-02</publish_date><description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description></book><book id="bk108"><author>Knorr, Stefan</author><title>Creepy Crawlies</title><genre>Horror</genre><price>4.95</price><publish_date>2000-12-06</publish_date><description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description></book><book id="bk109"><author>Kress, Peter</author><title>Paradox Lost</title><genre>Science Fiction</genre><price>6.95</price><publish_date>2000-11-02</publish_date><description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description></book><book id="bk110"><author>O'Brien, Tim</author><title>Microsoft .NET: The Programming Bible</title><genre>Computer</genre><price>36.95</price><publish_date>2000-12-09</publish_date><description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description></book><book id="bk111"><author>O'Brien, Tim</author><title>MSXML3: A Comprehensive Guide</title><genre>Computer</genre><price>36.95</price><publish_date>2000-12-01</publish_date><description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description></book><book id="bk112"><author>Galos, Mike</author><title>Visual Studio 7: A Comprehensive Guide</title><genre>Computer</genre><price>49.95</price><publish_date>2001-04-16</publish_date><description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description></book></catalog>

1 个答案:

答案 0 :(得分:0)

ex2 <- "<?xml version='1.0' encoding='UTF-8' ?><!DOCTYPE foo SYSTEM 'Foo.dtd'><a><b>bbb</b><c/><d><soapenv:Envelope xmlns:soapenv='http://xxx' xmlns:xsd='http://yyy' xmlns:xsi='http://zzz'></soapenv:Envelope></d><e><![CDATA[ <z></z> ]]></e><f><g></g></f></a>"

XML::xmlParse(ex2)

用/ soapenv替换/ soapenv:信封。

错误消息可以自我解释。