I have this xpath:
/document/offers/offer/concat(price/text(), for $r in . return 'default-value'[not($r/price/text())])
which solves my problem (default value for missing tags) for this document:
<document>
<company>
<ceo>Elon Musk</ceo>
<employees>13058</employees>
<address>
<city>Palo Alto</city>
<state>California</state>
<country>USA</country>
</address>
</company>
<offers>
<offer avail="0">
<id>1</id>
<model>Tesla Roadster</model>
<imageUrl>https://www.teslamotors.com/sites/default/files/styles/blog-picture_2x_1400xvar_/public/0H8E6227_1.jpg</imageUrl>
</offer>
<offer avail="1">
<id>2</id>
<model>Tesla Model S</model>
<price>63400.00</price>
<offerUrl>https://www.teslamotors.com/models</offerUrl>
<imageUrl>https://www.teslamotors.com/tesla_theme/assets/img/models/section-initial.jpg</imageUrl>
</offer>
<offer avail="1">
<id>3</id>
<model>Tesla Model X</model>
<price>69300.00</price>
<offerUrl>https://www.teslamotors.com/modelx</offerUrl>
<imageUrl>https://www.teslamotors.com/tesla_theme/assets/img/modelx/section-exterior-profile.jpg</imageUrl>
</offer>
<offer avail="1">
<id>4</id>
<model>Tesla Model 3</model>
<price>35000.00</price>
<offerUrl>https://www.teslamotors.com/model3</offerUrl>
<imageUrl>https://www.teslamotors.com/sites/default/files/images/model-3/gallery/gallery-1.jpg</imageUrl>
</offer>
</offers>
</document>
by returning:
default-value
63400.00
69300.00
35000.00
According to http://videlibri.sourceforge.net/cgi-bin/xidelcgi, this works, but I cannot make it work with lxml in Python. Right now I don't even know how to google the equivalent of this type of XPath. So, what are these "inner fors" called in XPath?
Answer 0 (score: 3)
That is an XPath 2.0 for-expression, which is not supported by lxml or xml.etree in Python (lxml implements XPath 1.0, and xml.etree only a subset of it). You could replicate it using a for loop:
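from lxml import etree
# Parse the file, then visit each offer and look up its optional price.
xml = etree.parse("the_file")
for node in xml.xpath("//document/offers/offer"):
    # xpath() returns a list, which is empty when the offer has no <price>.
    pr = node.xpath("./price")
    print(pr[0].text if pr else "Default-value")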
Which would give you:
Default-value
63400.00
69300.00
35000.00
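If the goal is only a default for a missing child element, the findtext method may be a shorter route than a per-node XPath call. The sketch below is not part of the original answer: it uses the standard library's xml.etree.ElementTree (lxml.etree elements offer a findtext method with the same default argument), and both the trimmed-down SAMPLE document and the prices_with_default helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a trimmed-down copy of the document from the question,
# keeping only the elements relevant to the price lookup.
SAMPLE = """
<document>
  <offers>
    <offer avail="0"><id>1</id><model>Tesla Roadster</model></offer>
    <offer avail="1"><id>2</id><model>Tesla Model S</model><price>63400.00</price></offer>
    <offer avail="1"><id>3</id><model>Tesla Model X</model><price>69300.00</price></offer>
    <offer avail="1"><id>4</id><model>Tesla Model 3</model><price>35000.00</price></offer>
  </offers>
</document>
"""


def prices_with_default(xml_text, default="default-value"):
    """Return one price string per <offer>, or `default` when <price> is absent."""
    root = ET.fromstring(xml_text.strip())
    # findtext() returns the text of the first matching child element,
    # or the supplied default when no such child exists.
    return [
        offer.findtext("price", default=default)
        for offer in root.findall("./offers/offer")
    ]


prices = prices_with_default(SAMPLE)
print("\n".join(prices))
```

Note that findtext returns an empty string, not the default, when a price element is present but has no text, so an empty tag would need separate handling.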