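etree: inserting a node's attribute into its filtered children

Time: 2019-02-21 23:56:46

Tags: python xml lxml

I am working with an XML file. I want to build the output as a list of tuples so that I can bulk-insert it into a database.

The problem I can't seem to solve is putting a node's @id alongside the selected attributes of its child nodes.

Here is my sample document. Note that in my real file, every level has many other attributes that need to be filtered out. I created this XML as a more useful example.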

doc = """
<region id="5153419" name="North Shore" date="2019-02-15T00:00:00" >
  <shire abbrevname="Manly Council" code="20019" website="http://" >
  <location id="5178566" site="1" division="Dee Why" staff="3" >
    <reference isbn="978-1-891830-75-4" rating="Mature (18+)" title="110 Per¢" author="Tony Consiglio"/>
    <reference isbn="978-1-60309-2395" rating="Mature (16+)" title="American Elf 1999" author="James Kochalka" />
    <reference isbn="978-1-891830-37-2" rating="Young Adult (13+)" title="The Barefoot Serpent (softcover)" author="Scott Morse" />
    <reference isbn="978-1-891830-56-3" rating="Mature (16+)" title="Bighead" author="Jeffrey Brown"  />
    <reference isbn="978-1-891830-19-8" rating="Mature (18+)" title="Box Office Poison" author="Alex Robinson"  />
  </location>
  <location id="5178568" site="2" division="Brookvale" staff="5">
    <reference isbn="978-1-891830-37-2" rating="Young Adult (13+)" title="The Barefoot Serpent (softcover)" author="Scott Morse"/>
    <reference isbn="978-1-936561-69-8" rating="Adults Only (18+)" title="Chester 5000 (Book 2)" author="Isabelle George" />
    <reference isbn="978-1-891830-81-5" rating="Young Adult (13+)" title="Cry Yourself to Sleep" author="Jeremy Tinder" />
    <reference isbn="978-1-891830-75-4" rating="Mature (18+)" title="110 Per¢" author="Tony Consiglio" />
    <reference isbn="978-1-891830-77-8" rating="Mature (16+)" title="Every Girl is the End of the World for Me" author="Jeffrey Brown" />
    <reference isbn="978-0-9585783-4-9" rating="Mature (18+)" title="From Hell" author="Alan Moore and Eddie Campbell" />
  </location> 
  </shire>
</region>
"""

The output I want is

(location id, isbn, title)

[(5178566, 978-1-891830-75-4, 110 Per¢), (5178566, 978-1-60309-2395, American Elf 1999), ....... (5178568, 978-0-9585783-4-9, From Hell)]

I have tried several approaches, including getiterator and findall, but I just can't find a way to do it.

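from lxml import etree

tree = etree.fromstring(doc)  # parse the sample document above

filter_reference = ['isbn', 'title']
output_list = []
# Collects the wanted attribute values, but only as one flat list with no location id
for child in tree.findall('.//reference'):
    for k, v in child.items():
        if k in filter_reference:
            output_list.append(v)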

1 Answer:

Answer 0: (score: 1)

Iterate over the child elements and pick out the attributes you need:
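The following is a minimal sketch of that idea, not necessarily the answerer's exact code. It assumes the id wanted in each tuple is the one on the enclosing `location` element, and it uses the standard library's `xml.etree.ElementTree` so it runs without extra installs; `lxml.etree` offers the same `fromstring`, `iter`, `findall`, and `get` calls. The helper name `extract_rows` and the trimmed sample document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question (illustrative subset).
doc = """
<region id="5153419" name="North Shore" date="2019-02-15T00:00:00">
  <shire abbrevname="Manly Council" code="20019" website="http://">
  <location id="5178566" site="1" division="Dee Why" staff="3">
    <reference isbn="978-1-60309-2395" rating="Mature (16+)" title="American Elf 1999" author="James Kochalka"/>
    <reference isbn="978-1-891830-56-3" rating="Mature (16+)" title="Bighead" author="Jeffrey Brown"/>
  </location>
  <location id="5178568" site="2" division="Brookvale" staff="5">
    <reference isbn="978-0-9585783-4-9" rating="Mature (18+)" title="From Hell" author="Alan Moore and Eddie Campbell"/>
  </location>
  </shire>
</region>
"""


def extract_rows(xml_text, fields=("isbn", "title")):
    """Return one (location id, *fields) tuple per <reference> element.

    Hypothetical helper: the name and signature are not from the original post.
    """
    root = ET.fromstring(xml_text)
    rows = []
    # Outer loop: every <location>, wherever it sits below the root.
    for location in root.iter("location"):
        location_id = location.get("id")
        # Inner loop: only the <reference> children of this location,
        # keeping just the attributes named in `fields`.
        for reference in location.findall("reference"):
            rows.append((location_id, *(reference.get(name) for name in fields)))
    return rows


rows = extract_rows(doc)
for row in rows:
    print(row)
```

Attribute values come back as strings, so wrap `location.get("id")` in `int()` if the database column is numeric. The resulting list of tuples is in the shape that a DB-API cursor's `executemany` expects for a bulk insert.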
