Question

I'm using lxml to create a JSON file from an XML file. The XML file has this structure:

<spots_list>
    <spot id="001" latitude="2011464" longitude="979511">
        <adress>Somewhere</adress>
        <city>BOSTON</city>
        <price category="Intermediate" value="782"/>
        <price category="Expensive" value="2765"/>
        <price category="Cheap" value="12"/>
     </spot>
    <spot id="002" latitude="2101644" longitude="915971">
        <adress>Somewhere else (very very far away)</adress>
        <city>CAMBRIDGE</city>
        <price category="Intermediate" value="472"/>
        <price category="Intermediate (but less expensive)" value="422"/>
        <price category="Expensive" value="20275"/>
        <price category="Cheap" value="12"/>
     </spot>
</spots_list>

The number of price elements in each spot element can vary, so I tried using a while loop in Python. Here is the relevant code:

from lxml import etree

tree = etree.parse("my_file.xml")

for node in tree.xpath("//spots_list/spot"):
    for adress in node.xpath("adress"):
        adr = adress.text
    while node.xpath("price"):
        print(adr)

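I know this is wrong, because the first address is printed over and over again, but I don't know how to write this loop so that it moves on to the next element...

Thanks.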

Answer 1

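The basic problem with your while statement is that node.xpath(...) returns a list, and a list is treated as True whenever it is not empty. Nothing inside the loop changes that list, so the condition stays true and the loop never ends. You just need to do the same thing you did at the top level, i.e. iterate over the elements you are interested in, e.g.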

from lxml import etree

def parse_spot(el):
    adr = el.find('adress')
    return dict(
        address=adr.text if adr is not None else None,  # None if <adress> is missing
        price=[dict(p.attrib) for p in el.findall('price')]  # one dict per <price>
    )

tree = etree.fromstring(xml)  # xml is your example as a string

[parse_spot(el) for el in tree.findall('./spot')]
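To make the truthiness point concrete, here is a minimal sketch that inspects what xpath() returns and then prints one line per price with nested for loops. The `XML` constant is a trimmed copy of the question's data inlined for the demo, and the names `first_spot`, `prices` and `lines` are illustrative only.

```python
from lxml import etree

# Trimmed copy of the XML from the question, inlined for this demo (assumption).
XML = """
<spots_list>
    <spot id="001" latitude="2011464" longitude="979511">
        <adress>Somewhere</adress>
        <price category="Intermediate" value="782"/>
        <price category="Cheap" value="12"/>
    </spot>
    <spot id="002" latitude="2101644" longitude="915971">
        <adress>Somewhere else (very very far away)</adress>
        <price category="Expensive" value="20275"/>
    </spot>
</spots_list>
"""

root = etree.fromstring(XML.strip())
first_spot = root.xpath("spot")[0]

# xpath() evaluates the expression once and returns a list of matching elements.
prices = first_spot.xpath("price")
print(isinstance(prices, list), len(prices), bool(prices))

# Re-evaluating the same expression yields the same non-empty list every time,
# so `while first_spot.xpath("price"):` can never become false.
# Nested for loops visit each spot and each of its prices exactly once.
lines = []
for spot in root.xpath("spot"):
    adr = spot.findtext("adress")
    for price in spot.xpath("price"):
        lines.append(f"{adr}: {price.get('category')} = {price.get('value')}")

print("\n".join(lines))
```

The inner loop runs as many times as there are price elements in the current spot, so a varying number of prices per spot needs no special handling.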

You can also use xpath instead of findall, as you did before.
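For reference, an xpath-based variant might look like the sketch below, with the result serialized through the standard json module since JSON is the stated goal. The inline `XML` constant and the output layout (an `address` key plus a `price` list per spot) are assumptions carried over from the example above, not a required format.

```python
import json
from lxml import etree

# Trimmed copy of the XML from the question, inlined for this demo (assumption).
XML = """
<spots_list>
    <spot id="001" latitude="2011464" longitude="979511">
        <adress>Somewhere</adress>
        <price category="Intermediate" value="782"/>
        <price category="Cheap" value="12"/>
    </spot>
    <spot id="002" latitude="2101644" longitude="915971">
        <adress>Somewhere else (very very far away)</adress>
        <price category="Expensive" value="20275"/>
    </spot>
</spots_list>
"""

def parse_spot(el):
    # xpath() returns a list of text nodes here; it is empty if <adress> is missing.
    adr = el.xpath("adress/text()")
    return {
        "address": str(adr[0]) if adr else None,
        "price": [dict(p.attrib) for p in el.xpath("price")],
    }

root = etree.fromstring(XML.strip())
spots = [parse_spot(el) for el in root.xpath("spot")]

# Attribute values stay strings; convert them yourself if you need numbers.
as_json = json.dumps(spots, indent=2)
print(as_json)
```

To write a file instead of printing, json.dump(spots, fp) with an open file object does the same serialization.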

lxml: while loop over a node's identical elements
