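I have a parsed XML tree and have used the <lastmod> nodes to find the most recently added <url> node. How can I "save" that node's position in the tree and use it to get the other nodes inside the <url> it belongs to?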
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.website.com/</loc>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>https://www.website.com/location/</loc>
    <lastmod>2016-10-13T06:03:41Z</lastmod>
    <changefreq>daily</changefreq>
    <image:image>
      <image:loc>https://website.com/image/</image:loc>
      <image:title>Title of Item</image:title>
    </image:image>
  </url>
  <url>
    <loc>https://www.website.com/location/</loc>
    <lastmod>2016-09-15T07:11:22Z</lastmod>
    <changefreq>daily</changefreq>
    <image:image>
      <image:loc>https://website.com/image/</image:loc>
      <image:title>Title of Item</image:title>
    </image:image>
  </url>
</urlset>
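Of the two <url> tags that have a <lastmod>, the first one is the most recent addition to the XML document. However, you have to loop through the entire XML document to find that out. How do you save the "position" of that XML tag so you can later get its <image:title>? Here is my code: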
tree = get_xml_data(line)
jul_newest = 0.0  # establish a comparison value for the newest addition
for child in tree:
    if child.tag.endswith("url"):
        for c in child:
            if c.tag.endswith("lastmod"):
                xml_date = c.text
                year = float(xml_date[0:4])
                month = float(xml_date[5:7])
                day = float(xml_date[8:10])
                hour = float(xml_date[11:13])
                minute = float(xml_date[14:16])
                second = float(xml_date[17:19])
                # calculate Julian day number of recent addition
                jul_day = julian(year, month, day, hour, minute, second)
                if jul_day > jul_newest:
                    nt.set_year(int(year))
                    nt.set_month(int(month))
                    nt.set_day(int(day))
                    nt.set_hour(int(hour))
                    nt.set_minute(int(minute))
                    nt.set_second(int(second))
                    jul_newest = jul_day
                    nt.set_jul(jul_day)
# find loc of the latest addition
for child in tree:
    if child.tag.endswith("url"):
        for c in child:
            if c.tag.endswith("loc"):
                nt.set_location(c.text)
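
To make the idea of "saving the position" concrete, here is a minimal sketch of one possible approach, assuming the tree comes from the standard library's xml.etree.ElementTree. Element objects are ordinary Python objects, so a reference to the newest <url> element can be kept while looping and queried afterwards. The names NS, SAMPLE and find_newest_url are hypothetical, the sample is a trimmed copy of the sitemap above with the two titles changed so they can be told apart, and datetime.strptime stands in for the julian() helper, which is not shown here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace URIs copied from the sitemap; the "sm" prefix is an arbitrary
# label chosen for this sketch.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

# Trimmed sample; the titles are altered for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.website.com/</loc>
  </url>
  <url>
    <loc>https://www.website.com/location/</loc>
    <lastmod>2016-10-13T06:03:41Z</lastmod>
    <image:image>
      <image:title>Newest item</image:title>
    </image:image>
  </url>
  <url>
    <loc>https://www.website.com/location/</loc>
    <lastmod>2016-09-15T07:11:22Z</lastmod>
    <image:image>
      <image:title>Older item</image:title>
    </image:image>
  </url>
</urlset>"""


def find_newest_url(root):
    """Return (url_element, timestamp) for the <url> with the latest <lastmod>.

    Returns (None, None) if no <url> has a <lastmod>.
    """
    newest_url, newest_time = None, None
    for url in root.findall("sm:url", NS):
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None:
            continue  # entries without <lastmod> cannot be compared
        stamp = datetime.strptime(lastmod, "%Y-%m-%dT%H:%M:%SZ")
        if newest_time is None or stamp > newest_time:
            # Keep the element itself; this reference is the saved "position".
            newest_url, newest_time = url, stamp
    return newest_url, newest_time


root = ET.fromstring(SAMPLE.encode("utf-8"))
newest_url, newest_time = find_newest_url(root)

# The saved element can now be searched for any of its own children.
loc = newest_url.findtext("sm:loc", namespaces=NS)
title = newest_url.findtext("image:image/image:title", namespaces=NS)
print(newest_time, loc, title)
```

With the element stored, the second pass over the whole tree should no longer be needed: loc and title are read from the same <url> that supplied the newest <lastmod>.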