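I have a long XML document that I am editing with Python. The document is generated every month and then edited by hand, and I am trying to automate at least part of that process. The first part works fine. However, when I run the document through an XSLT stylesheet (in change()) to do the final reordering, I get both the reordered elements and the original elements in their original order, and I don't know why. At first I thought it was because I rewrite the same file over and over, but the duplicates only appear after change() runs. So I think it has to do with how I am using XSLT, but I am a real beginner. I would greatly appreciate any help you can offer.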
from __future__ import print_function
from lxml import etree
import xml.etree.ElementTree as et

def adultSmash():
    def adultGrab():  # grab all adult events
        src_tree = et.parse('quartertwo.xml')
        src_root = src_tree.getroot()
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in src_root.findall('event'):
            agerange = event.find('AgeRanges')
            if agerange is None:
                continue
            ageranges = agerange.text
            if ageranges == 'Adult':
                dest_root.append(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def clean():  # remove book-related event types
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        for event in dest_root.findall('event'):
            book = event.find('EventType')
            if book is None:  # skip events without an EventType
                continue
            books = book.text
            if books in ('Book Groups', 'Book Sales', 'Bookmobile Stop'):
                dest_root.remove(event)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def cleanNodes():  # remove every Notes element from each event
        dest_tree = et.parse('dest_tree.xml')
        dest_root = dest_tree.getroot()
        foos = dest_tree.findall('event')
        for event in foos:
            bars = event.findall('Notes')
            for Notes in bars:
                event.remove(Notes)
        et.ElementTree(dest_root).write('dest_tree.xml')

    def change():  # reorder the elements with the XSLT stylesheet
        # XSLT is provided by lxml (etree), not by xml.etree.ElementTree (et)
        dom = etree.parse('dest_tree.xml')
        xslt = etree.parse('change.xslt')
        transform = etree.XSLT(xslt)
        newdom = transform(dom)
        with open('dest_tree.xml', 'w') as log:  # close the file when done
            print(str(newdom), file=log)

    adultGrab()
    clean()
    cleanNodes()
    change()
Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Friday, September 2, 2016</Date>
    <DateYear>2016</DateYear>
    <DateMonth>09</DateMonth>
    <DateDay>02</DateDay>
    <Body>Derp</Body>
    <Notes>Notes are not displayed to the public.</Notes>
  </event>
</events>
Here is the XSLT I am using to transform it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" />
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="event">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="title" />
      <xsl:apply-templates select="RelatedLocations" />
      <xsl:apply-templates select="Date" />
      <xsl:apply-templates select="DateYear" />
      <xsl:apply-templates select="DateMonth" />
      <xsl:apply-templates select="DateDay" />
      <xsl:apply-templates select="Body" />
      <xsl:apply-templates select="AgeRanges" />
      <xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
And finally, here is the result:
<?xml version="1.0" encoding="UTF-8"?>
<events>
  <event>
    <title>Blah</title>
    <RelatedLocations>Derp</RelatedLocations>
    <Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
    <DateYear>2016</DateYear>
    <DateMonth>10</DateMonth>
    <DateDay>01</DateDay>
    <Body>Blah</Body>
    <AgeRanges>Adult</AgeRanges>
    <AgeRanges>Adult</AgeRanges>
    <title></title>
    <RelatedLocations>Blah</RelatedLocations>
    <Date>Every Saturday through Nov 30 2016. Saturday, October 1, 2016 - 10 a.m.-5 p.m.</Date>
    <DateYear>2016</DateYear>
    <DateMonth>10</DateMonth>
    <DateDay>01</DateDay>
    <Body>Blah</Body>
So any help would be appreciated.
Answer 0 (score: 0)
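You get duplicate nodes in the output because you apply templates to the same nodes twice. For example, you do this:
<xsl:apply-templates select="title" />
and then:
<xsl:apply-templates select="*[not(self::Location or self::EventType)]" />
The title element is neither Location nor EventType, so the second instruction selects it and applies templates to it again. The same holds for every other child element that was already selected by name.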
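In the stylesheet itself, one likely fix is to widen the predicate of the last instruction so that it also excludes the elements already selected by name (title, RelatedLocations, Date, DateYear, DateMonth, DateDay, Body, AgeRanges), not just Location and EventType. The sketch below shows the same "select each child exactly once" idea in plain Python with the standard library's ElementTree instead of XSLT. It is an illustration under assumptions, not the asker's code: the names PREFERRED_ORDER, DROPPED and reorder_event, and the Extra element in the sample, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical names; the tag lists mirror the stylesheet in the question.
PREFERRED_ORDER = ["title", "RelatedLocations", "Date", "DateYear",
                   "DateMonth", "DateDay", "Body", "AgeRanges"]
DROPPED = {"Location", "EventType"}


def reorder_event(event):
    """Reorder the children of one <event>, emitting each child exactly once."""
    children = list(event)
    ordered = []
    # First pass: children named in PREFERRED_ORDER, in that order.
    for tag in PREFERRED_ORDER:
        ordered.extend(child for child in children if child.tag == tag)
    # Second pass: everything else. Skipping only DROPPED here (and not the
    # tags handled above) is the mistake that produces duplicates.
    handled = set(PREFERRED_ORDER) | DROPPED
    ordered.extend(child for child in children if child.tag not in handled)
    # Swap the old child list for the reordered one.
    for child in children:
        event.remove(child)
    event.extend(ordered)


# Small sample; <Extra> is a made-up element standing in for "anything else".
SAMPLE = """<events>
  <event>
    <EventType>Blah</EventType>
    <title>Blah Blah</title>
    <Body>Derp</Body>
    <AgeRanges>Adult</AgeRanges>
    <Extra>kept</Extra>
  </event>
</events>"""

root = ET.fromstring(SAMPLE)
for event in root.findall("event"):
    reorder_event(event)

tags = [child.tag for child in root.find("event")]
print(tags)  # ['title', 'Body', 'AgeRanges', 'Extra']
```

EventType is dropped, the named elements come first in the requested order, and the remaining element appears once at the end. The equivalent XSLT change is to list the already-handled element names inside the not(...) of the final apply-templates.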