How to add subtree data to the main tree using XPath in Python / Django

Asked: 2011-09-23 16:37:17

Tags: python xml django dom xpath

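I'm using etree to parse an external XML file. I'm trying to get the listing data from the tree in the XML below and add the agency subtree data to it. I can pull the listing data and the agency data separately just fine, but I don't know how to merge them so that each listing gets the correct agency info.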

xml:

<response>
    <listing>
        <bathrooms>2.1</bathrooms>
        <bedrooms>3</bedrooms>
        <agency>
            <name>Bob's Realty</name>
            <phone>555-693-4356</phone>
        </agency>
    </listing>
    <listing>
        <bathrooms>3.1</bathrooms>
        <bedrooms>5</bedrooms>
        <agency>
            <name>Larry's Homes</name>
            <phone>555-324-6532</phone>
        </agency>
    </listing>
</response>

Python:

import lxml.etree

tree = lxml.etree.parse("http://www.someurl.com?random=blahblahblah")
listings = tree.xpath("/response/listing")
agencies = tree.xpath("/response/listing/agency")

listings_info = []

for listing in listings:
    this_value = {
        "bedrooms": listing.findtext("bedrooms"),
        "bathrooms": listing.findtext("bathrooms"),
    }

    # This inner loop walks every agency in the document for each listing,
    # so the value left in this_value is always the last agency's name.
    for agency in agencies:
        this_value['agency'] = agency.findtext("name")

    listings_info.append(this_value)

I tried adding this just above where listings_info.append(this_value) occurs, but that isn't correct: it just appends the last agency value to every listing.

I'm outputting the data as JSON, and this is what it looks like (you can see how one agency's info is being put into both results):

    {"listings":[{"agency": "Bob's Realty", "phone":"555-693-4356", "bathrooms": "2.1", "bedrooms": "3"},{"agency": "Bob's Realty", "phone":"555-693-4356", "bathrooms": "3.1", "bedrooms": "5"} ]}

How can I merge the data from response/listing/agency with response/listing in my original for statement?

1 answer:

Answer 0 (score: 1)

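You can use listing.xpath('agency/name/text()')[0] while iterating over the listings to get the agency name for that particular listing. Because the XPath expression is relative, it is evaluated against the current listing element rather than the whole document.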

for listing in listings:
    this_value = {
        'bedrooms': listing.findtext('bedrooms'),
        'bathrooms': listing.findtext('bathrooms'),
        'agency': listing.xpath('agency/name/text()')[0]
    }
    listings_info.append(this_value)
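
For reference, here is a self-contained sketch of the same idea that can be run without fetching the URL. It is only an illustration under a few assumptions: the XML from the question is embedded as a string, the helper name extract_listings is made up for this example, and findtext with a relative path is used in place of xpath(...)[0] so that a listing with no agency yields None instead of raising IndexError. The import falls back to the standard library's ElementTree when lxml is not installed, since findall and findtext behave the same way in both for these simple paths.

```python
import json

try:
    from lxml import etree  # what the question uses
except ImportError:
    # Assumption: stdlib fallback, sufficient for findall/findtext
    import xml.etree.ElementTree as etree

# Sample document copied from the question
XML = """<response>
    <listing>
        <bathrooms>2.1</bathrooms>
        <bedrooms>3</bedrooms>
        <agency>
            <name>Bob's Realty</name>
            <phone>555-693-4356</phone>
        </agency>
    </listing>
    <listing>
        <bathrooms>3.1</bathrooms>
        <bedrooms>5</bedrooms>
        <agency>
            <name>Larry's Homes</name>
            <phone>555-324-6532</phone>
        </agency>
    </listing>
</response>"""


def extract_listings(root):
    """Hypothetical helper: build one dict per <listing> child of root."""
    listings_info = []
    for listing in root.findall("listing"):
        listings_info.append({
            "bedrooms": listing.findtext("bedrooms"),
            "bathrooms": listing.findtext("bathrooms"),
            # Relative paths are resolved against this listing only,
            # so each listing picks up its own agency.
            "agency": listing.findtext("agency/name"),
            "phone": listing.findtext("agency/phone"),
        })
    return listings_info


root = etree.fromstring(XML)
listings_info = extract_listings(root)
print(json.dumps({"listings": listings_info}, indent=2))
```

The key design choice is that every lookup starts from the listing element, so no separate agencies list or nested loop is needed.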