Passing XML through an XSLT stylesheet results in duplicate nodes

Asked: 2016-11-09 17:56:06

Tags: python xml xslt lxml elementtree

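I have a long XML document that I am editing with Python. The document is generated every month and then edited by hand, and I am trying to automate at least part of that process. The first part works fine. However, when I run the document through an XSLT stylesheet (in change()) to do the final reordering, I get the reordered elements followed by the original elements in their original order, and I don't know why. At first I thought it was because I rewrite the same file over and over, but the duplicates only appear after change() runs. So I assume it has to do with how I am using XSLT, but I am a real beginner. Any help you can throw my way would be much appreciated.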

from __future__ import print_function

from lxml import etree
import xml.etree.ElementTree as et


def adultSmash():
    def adultGrab():  # grab all adult events
        src_tree = et.parse('quartertwo.xml')
        src_root = src_tree.getroot()
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in src_root.findall('event'):
            agerange = event.find('AgeRanges')
            if agerange is None:
                continue
            ageranges = agerange.text
            if ageranges == 'Adult':
                dest_root.append(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def clean():  # drop event types that should not be listed
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_root.findall('event'):
            book = event.find('EventType')
            if book is None:
                continue
            if book.text in ('Book Groups', 'Book Sales', 'Bookmobile Stop'):
                dest_root.remove(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def cleanNodes():  # strip the internal <Notes> elements
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_root.findall('event'):
            for notes in event.findall('Notes'):
                event.remove(notes)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def change():  # reorder each event's children with the XSLT
        # XSLT support comes from lxml, not xml.etree.ElementTree
        dom = etree.parse('dest_tree.xml')
        xslt = etree.parse('change.xslt')
        transform = etree.XSLT(xslt)
        newdom = transform(dom)
        with open('dest_tree.xml', 'w') as log:
            print(str(newdom), file=log)

    adultGrab()
    clean()
    cleanNodes()
    change()

Here is the XML:

<?xml version="1.0" encoding="utf-8"?>
<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
</events>

Here is the XSLT I use to transform it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output encoding="UTF-8" indent="yes" method="xml" />
    <xsl:strip-space elements="*"/>
    <xsl:template match="node()|@*">
            <xsl:copy>
                    <xsl:apply-templates select="node()|@*"/>
            </xsl:copy>
    </xsl:template>
    <xsl:template match="event">
            <xsl:copy>
                    <xsl:apply-templates select="@*" />
                    <xsl:apply-templates select="title" />
                    <xsl:apply-templates select="RelatedLocations" />
                    <xsl:apply-templates select="Date" />
                    <xsl:apply-templates select="DateYear" />
                    <xsl:apply-templates select="DateMonth" />
                    <xsl:apply-templates select="DateDay" />
                    <xsl:apply-templates select="Body" />
                    <xsl:apply-templates select="AgeRanges" />
                    <xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
            </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

And finally, here is the result:

<?xml version="1.0" encoding="UTF-8"?>
<events>
  <event>
    <title>Blah</title>
    <RelatedLocations>Derp</RelatedLocations>
    <Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
    <DateYear>2016</DateYear>
    <DateMonth>10</DateMonth>
    <DateDay>01</DateDay>
<Body>Blah</Body>
<AgeRanges>Adult</AgeRanges>
<AgeRanges>Adult</AgeRanges>
<title></title>
<RelatedLocations>Blah</RelatedLocations>
<Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
<DateYear>2016</DateYear>
<DateMonth>10</DateMonth>
<DateDay>01</DateDay>
<Body>Blah</Body>

So any help would be appreciated.

1 Answer:

Answer 0 (score: 0)

You are getting duplicate nodes in the output because you apply templates to the same nodes twice. For example, you do this:

<xsl:apply-templates select="title" />

and then:

<xsl:apply-templates select="*[not(self::Location or self::EventType)]" />

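A title element is neither a Location nor an EventType, so the second instruction applies templates to it again. The same holds for every other child element that the template has already selected by name.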
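One possible fix is to make the catch-all select also exclude every element that was already selected by name. The sketch below is only an illustration under some assumptions: the sample input is a shortened version of the question's XML, the `Extra` element and the `reorder` helper are hypothetical names added for the demonstration, and the stylesheet is passed as an in-memory string rather than read from change.xslt.

```python
from lxml import etree

# Shortened sample modeled on the question's XML; <Extra> is a hypothetical
# element that is not named in the stylesheet, so the catch-all must copy it.
XML = b"""<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <Date>Friday, September 2, 2016</Date>
    <Body>Derp</Body>
    <AgeRanges>Adult</AgeRanges>
    <Extra>kept once</Extra>
  </event>
</events>"""

# Same structure as the question's stylesheet, except that the last
# apply-templates also excludes the elements already handled by name.
XSLT = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="event">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="title"/>
      <xsl:apply-templates select="RelatedLocations"/>
      <xsl:apply-templates select="Date"/>
      <xsl:apply-templates select="DateYear"/>
      <xsl:apply-templates select="DateMonth"/>
      <xsl:apply-templates select="DateDay"/>
      <xsl:apply-templates select="Body"/>
      <xsl:apply-templates select="AgeRanges"/>
      <xsl:apply-templates select="*[not(self::Location or self::EventType
          or self::title or self::RelatedLocations or self::Date
          or self::DateYear or self::DateMonth or self::DateDay
          or self::Body or self::AgeRanges)]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""


def reorder(xml_bytes, xslt_bytes):
    """Apply the stylesheet to the document and return the result tree."""
    transform = etree.XSLT(etree.fromstring(xslt_bytes))
    return transform(etree.fromstring(xml_bytes))


result = reorder(XML, XSLT)
event = result.getroot().find('event')
child_tags = [child.tag for child in event]
print(child_tags)
```

With this select, each child of event should be copied only once: the named elements in the listed order, then any remaining elements such as the hypothetical Extra. If no other child elements are expected, simply deleting the last apply-templates instruction would likely work as well.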