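Creating CSV from XML using XSL

Time: 2016-05-21 03:31:52

Tags: xml xslt xml-parsing xslt-1.0 xslt-grouping

I am trying to extract customer data from a GNUCash XML file. The end goal is a comma-separated values text file listing the customer data found in the XML.

The command I am using is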

xsltproc transform.xsl data.xml

My XML input file (the full file can be downloaded from https://od.lk/d/OV8xMTAwMjI3ODBf/data.xml):

. . .
    <gnc:GncCustomer version="2.0.0">
      <cust:guid type="guid">a9dbdd1ec6fddad5e068bfa22f5ad93d</cust:guid>
      <cust:name>Customer2</cust:name>
      <cust:id>000102</cust:id>
      <cust:addr version="2.0.0">
        <addr:name>Billing Name</addr:name>
        <addr:addr1>Billing Address  Line 1</addr:addr1>
        <addr:addr2>Billing Address  Line 2</addr:addr2>
        <addr:addr3>Billing Address  Line 3</addr:addr3>
        <addr:addr4>Billing Address  Line 4</addr:addr4>
        <addr:phone>Phone</addr:phone>
        <addr:fax>Fax</addr:fax>
        <addr:email>E-mail</addr:email>
      </cust:addr>
      <cust:shipaddr version="2.0.0">
        <addr:name>Shipping Name</addr:name>
        <addr:addr1>Shipping Address Line 1</addr:addr1>
        <addr:addr2>Shipping Address Line 2</addr:addr2>
        <addr:addr3>Shipping Address Line 3</addr:addr3>
        <addr:addr4>Shipping Address Line 4</addr:addr4>
        <addr:phone>Shipping phone</addr:phone>
        <addr:fax>Shipping fax</addr:fax>
        <addr:email>Shipping e-mail</addr:email>
      </cust:shipaddr>
      <cust:notes>Notes</cust:notes>
      <cust:taxincluded>USEGLOBAL</cust:taxincluded>
      <cust:active>1</cust:active>
      <cust:discount>0/1</cust:discount>
      <cust:credit>0/1</cust:credit>
      <cust:currency>
        <cmdty:space>ISO4217</cmdty:space>
        <cmdty:id>USD</cmdty:id>
      </cust:currency>
      <cust:use-tt>1</cust:use-tt>
    </gnc:GncCustomer>
. . .

The XSL file I tried (also available at https://od.lk/d/OV8xMTAwMjI5MTBf/transform.xsl):

<xsl:stylesheet version="1.0" 
    xmlns="http://www.gnucash.org/XML/"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:cd="http://www.gnucash.org/XML/cd"
    xmlns:book="http://www.gnucash.org/XML/book"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:cmdty="http://www.gnucash.org/XML/cmdty"
    xmlns:trn="http://www.gnucash.org/XML/trn"
    xmlns:split="http://www.gnucash.org/XML/split"
    xmlns:act="http://www.gnucash.org/XML/act"
    xmlns:price="http://www.gnucash.org/XML/price"
    xmlns:ts="http://www.gnucash.org/XML/ts"
    xmlns:slot="http://www.gnucash.org/XML/kvpslot"
    xmlns:cust="http://www.gnucash.org/XML/cust"
    xmlns:entry="http://www.gnucash.org/XML/entry"
    xmlns:lot="http://www.gnucash.org/XML/lot"
    xmlns:invoice="http://www.gnucash.org/XML/invoice"
    xmlns:owner="http://www.gnucash.org/XML/owner"
    xmlns:job="http://www.gnucash.org/XML/job"
    xmlns:billterm="http://www.gnucash.org/XML/billterm"
    xmlns:bt-days="http://www.gnucash.org/XML/bt-days"
    xmlns:sx="http://www.gnucash.org/XML/sx"
    xmlns:fs="http://www.gnucash.org/XML/fs"
    xmlns:addr="http://www.gnucash.org/XML/custaddr">
<!-- The above were taken from https://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html -->

  <xsl:output method="text" encoding="utf-8"/>

  <!-- Use the <xsl:strip-space> instruction to eliminate all white-space-only text nodes from the source XML document. -->
  <xsl:strip-space elements="*"/>

  <xsl:param name="separator">,</xsl:param>

  <xsl:param name="quote">"</xsl:param>

  <xsl:param name="newline">&#10;</xsl:param>

  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="gnc-v2|gnc:book|gnc-account-example">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="gnc:GncCustomer">
    <xsl:apply-templates select="cust:id"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:guid"/>
    <xsl:value-of select="$separator"/>    
    <xsl:apply-templates select="cust:name"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:name"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:addr1"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:addr2"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:addr3"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:addr4"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:phone"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:fax"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:addr/addr:email"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:name"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:addr1"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:addr2"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:addr3"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:addr4"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:phone"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:fax"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:shipaddr/addr:email"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:notes"/>
    <!-- We somehow need to deal with the possibility that the user typed a line feed character in the notes field. -->
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:taxincluded"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:active"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:discount"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:credit"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:currency/cmdty:space"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:currency/cmdty:id"/>
    <xsl:value-of select="$separator"/>
    <xsl:apply-templates select="cust:use-tt"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="cust:id|cust:guid|cust:name|cust:addr/addr:name|cust:addr/addr:addr1|cust:addr/addr:addr2|cust:addr/addr:addr3|cust:addr/addr:addr4|cust:addr/addr:phone|cust:addr/addr:fax|cust:addr/addr:email|cust:shipaddr/addr:name|cust:shipaddr/addr:addr1|cust:shipaddr/addr:addr2|cust:shipaddr/addr:addr3|cust:shipaddr/addr:addr4|cust:shipaddr/addr:phone|cust:shipaddr/addr:fax|cust:shipaddr/addr:email|cust:notes|cust:taxincluded|cust:active|cust:discount|cust:credit|cust:currency/cmdty:space|cust:currency/cmdty:id|cust:use-tt">
    <xsl:value-of select="$quote"/>
    <xsl:value-of select="."/>
    <xsl:value-of select="$quote"/>
  </xsl:template>

  <xsl:template match="*"/>

</xsl:stylesheet>

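The results show that I am not descending into all of the child element nodes. In particular, I cannot get the billing details to appear. Could my XPath expressions be wrong?

For example, for Customer2, the run of consecutive commas shows that the billing data is missing: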

"000102","a9dbdd1ec6fddad5e068bfa22f5ad93d","Customer2",,,,,,,,,,,,,,,,,"Notes","USEGLOBAL","1","0/1","0/1","ISO4217","USD","1"

The result I want is a comma-separated list of the XML data from all nodes, for example:

"000102","a9dbdd1ec6fddad5e068bfa22f5ad93d","Customer2","Billing Name","Billing Address  Line 1","Billing Address  Line 2","Billing Address  Line 3","Billing Address  Line 4","Phone","Fax","E-mail","Shipping Name","Shipping Address Line 1","Shipping Address Line 2","Shipping Address Line 3","Shipping Address Line 4","Shipping phone","Shipping fax","Shipping e-mail","Notes","USEGLOBAL","1","0/1","0/1","ISO4217","USD","1"
  

It may help to know that the XSL works for cust:currency but not for cust:addr or cust:shipaddr.

1 Answer:

Answer 0: (score: 0)

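Whenever output goes missing, the first thing I look at is the namespaces. In this case, the stylesheet declares the addr namespace as:

xmlns:addr="http://www.gnucash.org/XML/custaddr"

In the XML file, however, it is declared as:

xmlns:addr="http://www.gnucash.org/XML/addr"

XPath matches elements by namespace URI, not by prefix, so the addr: steps in the stylesheet never match the addr: elements in the data. Set both declarations to the same value (the one used in the XML file) and I think everything should work.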
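As a quick check of that diagnosis, here is a minimal sketch of a stylesheet that binds addr to the URI reported above for the XML file and prints only the customer id and billing name. It assumes the gnc and cust URIs from the question's stylesheet are correct (the question's output suggests they are); it is a cut-down test, not a replacement for the full transform.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:cust="http://www.gnucash.org/XML/cust"
    xmlns:addr="http://www.gnucash.org/XML/addr">
  <!-- addr is bound to .../XML/addr (the URI in data.xml), not .../XML/custaddr -->

  <xsl:output method="text" encoding="utf-8"/>

  <!-- Emit one line per customer: id, then billing name -->
  <xsl:template match="/">
    <xsl:for-each select="//gnc:GncCustomer">
      <xsl:value-of select="concat(cust:id, ',', cust:addr/addr:name, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run it the same way as the original, for example `xsltproc check.xsl data.xml` (the file name check.xsl is just a placeholder). If the billing name shows up next to each id, the same one-line change to the xmlns:addr declaration in transform.xsl should fill in the empty address columns.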

By the way, the final template would be a lot neater if you wrote it like this:

<xsl:template match="*[ancestor::gnc:GncCustomer]" >
    <xsl:value-of select="concat('&quot;', . , '&quot;')"/>
</xsl:template>
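One thing the shorter template does not handle is a field that itself contains a double quote, which would break the CSV. The following is a rough sketch of one way to deal with that in XSLT 1.0, by doubling embedded quotes with a recursive named template. The name escape-quotes is a hypothetical helper, not something from the question, and the customer template is trimmed to two columns to keep the example short.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:cust="http://www.gnucash.org/XML/cust">

  <xsl:output method="text" encoding="utf-8"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//gnc:GncCustomer"/>
  </xsl:template>

  <!-- Trimmed to two columns for illustration: id and notes -->
  <xsl:template match="gnc:GncCustomer">
    <xsl:apply-templates select="cust:id"/>
    <xsl:text>,</xsl:text>
    <xsl:apply-templates select="cust:notes"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Quote any element below a customer, doubling embedded quotes -->
  <xsl:template match="*[ancestor::gnc:GncCustomer]">
    <xsl:text>"</xsl:text>
    <xsl:call-template name="escape-quotes">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
    <xsl:text>"</xsl:text>
  </xsl:template>

  <!-- Hypothetical helper: recursively replaces each " with "" -->
  <xsl:template name="escape-quotes">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&quot;')">
        <xsl:value-of select="concat(substring-before($text, '&quot;'), '&quot;&quot;')"/>
        <xsl:call-template name="escape-quotes">
          <xsl:with-param name="text" select="substring-after($text, '&quot;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

As for line feeds typed into the notes field: many CSV readers accept a line break inside a quoted field, so quoting alone may already be enough, but that depends on whatever program consumes the file.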