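XSLT fails to transform larger XML files

Time: 2014-09-28 19:30:44

Tags: xml xslt xpath xslt-1.0

I'm new to XML and XSLT, and I want to filter some information out of an XML file based on matching the values of certain elements. The solution I'm using works when the XML file contains only 1 or 2 Person elements. But when I process a larger XML file with more Person elements, it fails: only the last person is transformed as required.

This is my XML file: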

<People>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Mike</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>John</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Danny</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>17294083</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>IL</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Russel</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
                <state>NY</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
</People>

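What I basically want to do is this: if a person is licensed in a state (e.g. TX) and also has appointment information for that same state (e.g. TX), filter that license out. If it is the person's only license, filter out the whole person.

The new XML should contain only the required tags, and only those licenses whose state does not match a state in the appointment's licensed-states. A person whose licenses all match should be filtered out entirely.

This is the output I'm expecting: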

<People>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>Mike</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>John</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>Russel</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
</People>

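The third person, whose only license matches his appointment state, has been filtered out. For now I'm only using one state per appointment in most of the example, but if there are multiple states it should be able to filter on those as well.

How do I write an XSLT to filter this information? I'm using XSLT version 1.0.

At the moment I can apply the XSLT below to keep only the required tags. But I don't know how to filter on the license state correctly: it works on smaller files but fails when I process larger ones. I'd be very grateful if someone could guide me, because I don't know what is wrong or where it fails.

This is the XSLT I'm using: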

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!--Identity transform (aka identity template). This will match
and copy attributes and nodes (element, text, comment and
processing-instruction) without changing them. Unless a more
specific template matches, everything will get handled by this
template.-->    
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!--This template will match the "Person" element node. The "xsl:copy"
creates the new "Person" element. The "xsl:apply-templates" tells
the processor to apply templates to any attributes (of Person) or
elements listed in the "select". (Other elements will not be 
processed.) I used the union operator in the "select" so I wouldn't
have to write multiple "xsl:apply-templates".-->
<xsl:template match="Person">
    <xsl:copy>
        <xsl:apply-templates select="@*|first-name|last-name|
            required-tag1|required-tag2|licenses"/>
    </xsl:copy>
</xsl:template>

<!--This template will match any "license" element nodes that have a child 
"state" element whose value matches a "state" element node that is a 
child of "licensed-states". Since the "xsl:template" is empty, nothing 
is output or processed further.-->
<xsl:template match="license[state=//licensed-states/state]"/>

</xsl:stylesheet>

This is the output I get, which is wrong:

<?xml version="1.0" encoding="UTF-8"?>
<People>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Mike</first-name>
  <last-name>Hewitt</last-name>
  <licenses/>
</Person>
<Person>
   <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>John</first-name>
  <last-name>Jhonny</last-name>
  <licenses/>
</Person>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Danny</first-name>
  <last-name>Hewitt</last-name>
  <licenses/>
</Person>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Russel</first-name>
  <last-name>Jhonny</last-name>
  <licenses>
     <license>
        <number>840790</number>
        <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
        <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
     </license>
  </licenses>
</Person>
</People>

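I just can't figure out what's wrong, because when I delete the last two persons from the XML file and test with the same XSLT, it works perfectly. I also don't know how to remove the persons whose licenses all match.

3 answers:

Answer 0 (score: 1)

One obvious problem: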

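state=//licensed-states/state checks against every licensed-states/state in the document, not just the ones belonging to this person. Rather than searching the whole document from the root (which is what the leading // does), give a relative path from this license to the area you want to check. At the very least, you need to say that you are only looking within the same Person:

<xsl:template match="license[state=ancestor::Person//licensed-states/state]"/>

Giving the path more explicitly is likely to perform better:

<xsl:template match="license[state=ancestor::Person/appointments/appointment-info/licensed-states/state]"/>

Or, since you know that Person is two levels above license:

<xsl:template match="license[state=../../appointments/appointment-info/licensed-states/state]"/>

where .. is the abbreviation for parent::node().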
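For reference, here is a minimal sketch of the stylesheet from the question with only the last template changed to use the relative path. It is meant as an illustration of the fix above, not as a complete solution: a Person whose licenses are all removed (Danny in the sample input) is still output, with an empty licenses element, so the "filter the whole person" requirement needs an additional template.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- Identity transform: copies anything not handled by a more specific template. -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- Keep only the wanted children of Person (processed in document order). -->
<xsl:template match="Person">
    <xsl:copy>
        <xsl:apply-templates select="@*|required-tag1|required-tag2|
            first-name|last-name|licenses"/>
    </xsl:copy>
</xsl:template>

<!-- Drop a license whose state is among the appointment states
     of the SAME Person (../.. is the Person, two levels up). -->
<xsl:template match="license[state=../../appointments/appointment-info/licensed-states/state]"/>

</xsl:stylesheet>
```

With the sample input, this should leave IL for Mike, NY for John and CA for Russel, because each license is now compared only against its own person's appointment states.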

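Answer 1 (score: 0)

If you want to look up items in another part of the XML, consider using xsl:key. In your case, you want to look up a person's licensed states. This takes a little more effort, because you need a concatenated key made up of a unique identifier for the Person and the state value: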

<xsl:key name="state" match="licensed-states/state" use="concat(generate-id(ancestor::Person), '|', .)" />

generate-id() is a function that generates a unique ID for a node. (If the XML had some "id" attribute or element on Person, you could use that instead.)
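As a rough sketch of that alternative: suppose, purely as an assumption, that every Person carried a unique id attribute such as `<Person id="p1">`. The sample input in the question has no such attribute, so the name `id` here is hypothetical. The key and a matching lookup could then be written as the following fragment, to be placed inside xsl:stylesheet:

```xml
<!-- Assumption: each Person has a unique, hypothetical "id" attribute.
     The key value is "<person id>|<state>", e.g. "p1|TX". -->
<xsl:key name="state" match="licensed-states/state"
         use="concat(ancestor::Person/@id, '|', .)"/>

<!-- Drop a license whose state has an appointment for the same Person. -->
<xsl:template match="license[key('state', concat(ancestor::Person/@id, '|', state))]"/>
```

The `|` separator is only there to keep the two parts of the key from running together; any character that cannot occur in the id would do.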

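Now, you want to exclude persons who have an appointment for every state they are licensed in. To do this you need to phrase it as a double negative: exclude every Person who has no license whose state lacks an appointment: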

<xsl:template match="Person[not(licenses/license[not(key('state', concat(generate-id(ancestor::Person), '|', state)))])]"/>

Excluding a license for a state that has an appointment is slightly simpler:

<xsl:template match="license[key('state', concat(generate-id(ancestor::Person), '|', state))]"/>

Try this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="state" match="licensed-states/state" use="concat(generate-id(ancestor::Person), '|', .)" />

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="Person[not(licenses/license[not(key('state', concat(generate-id(ancestor::Person), '|', state)))])]"/>

<xsl:template match="license[key('state', concat(generate-id(ancestor::Person), '|', state))]"/>

<xsl:template match="appointments" />

</xsl:stylesheet>

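Also note how I removed the specific xsl:apply-templates select that listed first-name and so on, and instead used <xsl:template match="appointments" /> to exclude appointments, so that every child of Person other than appointments is copied. As a consequence, tag3 and tag4 are copied too.

Answer 2 (score: 0)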
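If tag3 and tag4 should be dropped as well, as in the expected output of the question, one possible adjustment is to list them in the same empty template. This is a sketch of a replacement for the appointments template in the stylesheet above, assuming those are the only unwanted children of Person:

```xml
<!-- Empty template: none of these elements (or their content) is output. -->
<xsl:template match="appointments|tag3|tag4"/>
```

This keeps the "copy everything except what is explicitly excluded" approach. If the list of unwanted elements is long or unknown, the explicit select on Person used in the question is the safer choice.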

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!--Identity transform (aka identity template). This will match
and copy attributes and nodes (element, text, comment and
processing-instruction) without changing them. Unless a more
specific template matches, everything will get handled by this
template.-->    
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!--This template will match the "Person" element node. The "xsl:copy"
creates the new "Person" element. The "xsl:apply-templates" tells
the processor to apply templates to any attributes (of Person) or
elements listed in the "select". (Other elements will not be 
processed.) I used the union operator in the "select" so I wouldn't
have to write multiple "xsl:apply-templates".-->
<xsl:template match="Person">
    <xsl:copy>
        <xsl:apply-templates select="@*|first-name|last-name|
            required-tag1|required-tag2|licenses"/>
    </xsl:copy>
</xsl:template>

<!--This template will match any "license" element nodes that have a child 
"state" element whose value matches a "state" element node that is a 
child of "licensed-states". 
This template will also match the "Person" element node if the number of
"state" elements that don't have a corresponding "licensed-state"
is equal to zero. ("filtered person who matched all licenses"
requirement.)
Since the "xsl:template" is empty, nothing 
is output or processed further.-->
<xsl:template match="license[state=../..//licensed-states/state]|
Person[count(licenses/license[not(state=../..//licensed-states/state)])=0]"/>

</xsl:stylesheet>
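The count(...)=0 test in the last template can also be expressed with not(), since an empty node-set converts to false in XPath 1.0. The following is only a sketch of an equivalent replacement for that template, with the path narrowed to the appointments element of the same Person:

```xml
<!-- Drop a license that matches one of its own Person's appointment states,
     and drop a Person that has no license left after that filtering. -->
<xsl:template match="license[state=../../appointments//licensed-states/state]|
Person[not(licenses/license[not(state=../../appointments//licensed-states/state)])]"/>
```

Note that in both forms a Person with no license elements at all also matches the second pattern and is therefore removed.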