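Converting an XML document to CSV, including its attributes and data, with XSLT

Date: 2014-07-29 10:04:00

Tags: xml csv xslt

How can I use XSLT to convert an XML document to CSV, including its attribute values?

Thanks in advance.

Here is my XML document: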

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk107">
      <author>Thurman, Paula</author>
      <title>Splish Splash</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2000-12-06</publish_date>
      <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
   </book>
   <book id="bk109">
      <author>Kress, Peter</author>
      <title>Paradox Lost</title>
      <genre>Science Fiction</genre>
      <price>6.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description>
   </book>
   <book id="bk111">
      <author>O'Brien, Tim</author>
      <title>MSXML3: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-01</publish_date>
      <description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description>
   </book>
   <book id="bk112">
      <author>Galos, Mike</author>
      <title>Visual Studio 7: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>49.95</price>
      <publish_date>2001-04-16</publish_date>
      <description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description>
   </book>
</catalog>

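Starting from the first child element, the values of all elements and their attributes should be printed, separated by commas.

2 answers:

Answer 0 (score: 2)

While mhawke's answer works, it is not very flexible. For example, if you want to normalize the whitespace in the values (I assume you do, because with your current input some fields would span multiple lines, which many CSV consumers do not support), you have to wrap every single field. Also, if you introduce new fields, you have to change the stylesheet.

In XSLT you would normally use templates, which makes the stylesheet easier to extend. The following stylesheet works with your input. I added the normalize-space function and xsl:strip-space to remove insignificant whitespace from the source document, and of course it also outputs the attribute.

If you have multiple attributes, you can also apply templates to the attributes to make this more flexible.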

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method='text'/>

    <xsl:template match="/">        
        <xsl:apply-templates select="catalog/book"/>
    </xsl:template>

    <!-- matches a book-record -->
    <xsl:template match="book">
        <xsl:value-of select="@id"/>
        <xsl:text>,</xsl:text>
        <xsl:apply-templates select="*" />
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- matches any field (inside book record) -->
    <xsl:template match="*">
        <xsl:text>"</xsl:text>
        <!-- use normalize-space, otherwise fields will span multiple lines -->
        <xsl:value-of select="normalize-space(.)" />
        <xsl:text>"</xsl:text>

        <!-- no comma after the last field -->
        <xsl:if test="following-sibling::*">,</xsl:if>
    </xsl:template>
</xsl:stylesheet>

With your input, it produces this output (note that I deliberately did not quote the ID value):

bk101,"Gambardella, Matthew","XML Developer's Guide","Computer","44.95","2000-10-01","An in-depth look at creating applications with XML."
bk102,"Ralls, Kim","Midnight Rain","Fantasy","5.95","2000-12-16","A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."
bk103,"Corets, Eva","Maeve Ascendant","Fantasy","5.95","2000-11-17","After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society."
bk104,"Corets, Eva","Oberon's Legacy","Fantasy","5.95","2001-03-10","In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant."
bk105,"Corets, Eva","The Sundered Grail","Fantasy","5.95","2001-09-10","The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy."
bk106,"Randall, Cynthia","Lover Birds","Romance","4.95","2000-09-02","When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled."
bk107,"Thurman, Paula","Splish Splash","Romance","4.95","2000-11-02","A deep sea diver finds true love twenty thousand leagues beneath the sea."
bk108,"Knorr, Stefan","Creepy Crawlies","Horror","4.95","2000-12-06","An anthology of horror stories about roaches, centipedes, scorpions and other insects."
bk109,"Kress, Peter","Paradox Lost","Science Fiction","6.95","2000-11-02","After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum."
bk110,"O'Brien, Tim","Microsoft .NET: The Programming Bible","Computer","36.95","2000-12-09","Microsoft's .NET initiative is explored in detail in this deep programmer's reference."
bk111,"O'Brien, Tim","MSXML3: A Comprehensive Guide","Computer","36.95","2000-12-01","The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more."
bk112,"Galos, Mike","Visual Studio 7: A Comprehensive Guide","Computer","49.95","2001-04-16","Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C++, C#, and ASP+ are integrated into a comprehensive development environment."
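
The earlier remark about applying templates to attributes could look roughly like the sketch below. The `match="@*"` template and the choice to emit a comma after every attribute are assumptions made for illustration; they are not part of the stylesheet above. For the sample input, where each book has only the id attribute, this should produce the same lines as shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="catalog/book"/>
    </xsl:template>

    <!-- one output line per book: all attributes first, then all child elements -->
    <xsl:template match="book">
        <xsl:apply-templates select="@*"/>
        <xsl:apply-templates select="*"/>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- assumption: every attribute is written unquoted, followed by a comma -->
    <xsl:template match="@*">
        <xsl:value-of select="."/>
        <xsl:text>,</xsl:text>
    </xsl:template>

    <!-- any field inside a book record: quoted and whitespace-normalized -->
    <xsl:template match="*">
        <xsl:text>"</xsl:text>
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:text>"</xsl:text>
        <!-- no comma after the last field -->
        <xsl:if test="following-sibling::*">,</xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

Two caveats: attribute order is not significant in XML, so with more than one attribute the column order may depend on the processor, and selecting the attributes by name is safer if the order matters. Also, a book without child elements would end with a trailing comma.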

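As an alternative to the stylesheet above, you could quote only the fields that actually need quoting (because they contain a comma). This example also shows why extending a stylesheet is so easy when you use template-based processing (which is what XSLT is all about):

Just replace the last template with these two templates: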

<!-- with a comma in the field, quote it -->
<xsl:template match="*[contains(text(), ',')]">
    <xsl:text>"</xsl:text>
    <!-- use normalize-space, otherwise fields will span multiple lines -->
    <xsl:value-of select="normalize-space(.)" />
    <xsl:text>"</xsl:text>

    <!-- no comma after the last field -->
    <xsl:if test="following-sibling::*">,</xsl:if>
</xsl:template>

<!-- without a comma -->
<xsl:template match="*">
    <xsl:value-of select="normalize-space(.)" />
    <xsl:if test="following-sibling::*">,</xsl:if>
</xsl:template>

The first line of the output now looks like this (note the deliberate absence of quotes around the third and subsequent fields):

bk101,"Gambardella, Matthew",XML Developer's Guide,Computer,44.95,2000-10-01,An in-depth look at creating applications with XML.
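
Neither variant above handles a field value that itself contains a double quote. The usual CSV convention (for example in RFC 4180) is to double such a quote inside a quoted field. Here is a minimal XSLT 1.0 sketch, assuming every field is quoted; the named template `escape-quotes` is a hypothetical helper and not part of the original answer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="catalog/book"/>
    </xsl:template>

    <xsl:template match="book">
        <xsl:value-of select="@id"/>
        <xsl:text>,</xsl:text>
        <xsl:apply-templates select="*"/>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- every field is quoted; embedded double quotes are doubled -->
    <xsl:template match="*">
        <xsl:text>"</xsl:text>
        <xsl:call-template name="escape-quotes">
            <xsl:with-param name="text" select="normalize-space(.)"/>
        </xsl:call-template>
        <xsl:text>"</xsl:text>
        <xsl:if test="following-sibling::*">,</xsl:if>
    </xsl:template>

    <!-- hypothetical helper: recursively writes $text with each " replaced by "" -->
    <xsl:template name="escape-quotes">
        <xsl:param name="text"/>
        <xsl:choose>
            <xsl:when test="contains($text, '&quot;')">
                <xsl:value-of select="substring-before($text, '&quot;')"/>
                <xsl:text>""</xsl:text>
                <xsl:call-template name="escape-quotes">
                    <xsl:with-param name="text" select="substring-after($text, '&quot;')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$text"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

The recursion is needed because XSLT 1.0 has no string replace function. The sample catalog contains no double quotes, so for that input the output should match the fully quoted version shown earlier.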

Answer 1 (score: 1)

You can access an attribute value by using "@attribute" in xsl:value-of, for example:

<xsl:value-of select="@id"/>

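Since you are generating CSV, you need to take into account that your data contains commas (for example in the <author>, <title> and <description> elements), so you will need to quote those fields. Alternatively, you could choose a different delimiter that does not occur in the data. Some information on quoting and related issues is available here.

Here is an XSL stylesheet that does what you need: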

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method='text'/>

<xsl:template match="/">
  <xsl:for-each select="catalog/book">
    <xsl:value-of select="@id"/>,"<xsl:value-of select="author"/>","<xsl:value-of select="title"/>",<xsl:value-of select="genre"/>,<xsl:value-of select="price"/>,<xsl:value-of select="publish_date"/>,"<xsl:value-of select="description"/>"<xsl:text>&#xA;</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
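
The alternative-delimiter idea mentioned in this answer could be sketched as follows. The stylesheet parameter `delim` and its tab default are assumptions chosen for illustration. Because normalize-space collapses any tab inside a value into a single space, a tab cannot occur inside a field here, so no quoting is needed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="text"/>

    <!-- hypothetical parameter: the field separator, a tab unless overridden -->
    <xsl:param name="delim" select="'&#x9;'"/>

    <xsl:template match="/">
        <xsl:apply-templates select="catalog/book"/>
    </xsl:template>

    <!-- one output line per book: id, separator, then the child elements -->
    <xsl:template match="book">
        <xsl:value-of select="@id"/>
        <xsl:value-of select="$delim"/>
        <xsl:apply-templates select="*"/>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- unquoted, whitespace-normalized field; no separator after the last one -->
    <xsl:template match="*">
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:if test="following-sibling::*">
            <xsl:value-of select="$delim"/>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

Most processors let you override a top-level parameter from the command line or API. If you pass a delimiter other than whitespace, you have to make sure yourself that it never appears in the data.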