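Using lxml to find different nodes and values in XML

Asked: 2013-02-04 14:06:51

Tags: python xml lxml

I am trying to figure out why lxml will parse some parts of the XML I'm looking at but not others.

The following snippet works and gives me all the titles I need: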

import lxml.html as lh  # assumed: the original snippet does not show what `lh` is

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//title/text()")

However, a simple change to either of the following two variants

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//itemId/text()")

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//globalId/text()")

just returns prices as a series of empty results.

The XML is given below...

<findItemsByProductResponse>
  <ack>Success</ack>
  <version>1.12.0</version>
  <timestamp>2013-02-04T13:35:57.106Z</timestamp>
  <searchResult count="31">
    <item>
      <itemId>130842622974</itemId>
      <title>BONES - COMPLETE SEASON 4 - BLURAY</title>
      <globalId>EBAY-US</globalId>
      <primaryCategory>
        <categoryId>617</categoryId>
        <categoryName>DVDs &amp; Blu-ray Discs</categoryName>
      </primaryCategory>
      <galleryURL>
        http://thumbs3.ebaystatic.com/m/mnuTBPOWZ-6F4kIHS1mj3gg/140.jpg
      </galleryURL>
      <viewItemURL>
        http://www.ebay.com/itm/BONES-COMPLETE-SEASON-4-BLURAY-/130842622974?pt=US_DVD_HD_DVD_Blu_ray
      </viewItemURL>
      <productId type="ReferenceID">78523575</productId>
      <paymentMethod>PayPal</paymentMethod>
      <autoPay>false</autoPay>
      <postalCode>60544</postalCode>
      <location>Plainfield,IL,USA</location>
      <country>US</country>
      <shippingInfo>
        <shippingServiceCost currencyId="USD">0.0</shippingServiceCost>
        <shippingType>Free</shippingType>
        <shipToLocations>Worldwide</shipToLocations>
        <expeditedShipping>true</expeditedShipping>
        <oneDayShippingAvailable>false</oneDayShippingAvailable>
        <handlingTime>1</handlingTime>
      </shippingInfo>
      <sellingStatus>
        <currentPrice currencyId="USD">12.99</currentPrice>
        <convertedCurrentPrice currencyId="USD">12.99</convertedCurrentPrice>
        <sellingState>Active</sellingState>
        <timeLeft>P3DT23H12M7S</timeLeft>
      </sellingStatus>
      <listingInfo>
        <bestOfferEnabled>false</bestOfferEnabled>
        <buyItNowAvailable>false</buyItNowAvailable>
        <startTime>2013-01-29T12:48:04.000Z</startTime>
        <endTime>2013-02-08T12:48:04.000Z</endTime>
        <listingType>FixedPrice</listingType>
        <gift>false</gift>
      </listingInfo>
      <returnsAccepted>true</returnsAccepted>
      <condition>
        <conditionId>1000</conditionId>
        <conditionDisplayName>Brand New</conditionDisplayName>
      </condition>
      <isMultiVariationListing>false</isMultiVariationListing>
      <topRatedListing>true</topRatedListing>
  </item>
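
One way to narrow a problem like this down is to list the tag names the parser actually stored for an item and compare them with the names used in the XPath expressions. The following is only a minimal sketch: it assumes `lh` is `lxml.html`, and `sample` is a hypothetical, trimmed-down stand-in for `resp` built from the XML above.

```python
import lxml.html as lh  # assumption: `lh` in the snippets above is lxml.html

# Hypothetical, trimmed-down stand-in for `resp`, based on the XML above.
sample = (
    "<findItemsByProductResponse><searchResult count='1'><item>"
    "<itemId>130842622974</itemId>"
    "<globalId>EBAY-US</globalId>"
    "</item></searchResult></findItemsByProductResponse>"
)

doc = lh.fromstring(sample)
first_item = doc.xpath('.//item')[0]

# Collect the child tag names exactly as the parser stored them.
child_tags = [child.tag for child in first_item]
print(child_tags)

# The mixed-case query from the question, run against the same item.
mixed_case_result = first_item.xpath('.//itemId/text()')
print(mixed_case_result)
```

If the printed tag names differ in case from the names in the XPath, that mismatch would explain the empty results.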

P.S. For bonus points, my next step will be finding convertedCurrentPrice (I figured I should solve one puzzle at a time) - the code I plan to use looks like

doc = lh.fromstring(resp)
for product in doc.xpath('.//item'):
    prices = product.xpath(".//sellingStatus/convertedCurrentPrice/text()")

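Does that look right, or is there a better way?

Thanks,

Matt

1 Answer:

Answer 0: (score: 3)

Try using itemid, since I think lxml converts the tag names to lowercase. Also, why not use: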

doc.xpath('.//item/itemid/text()')
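
To make the suggestion concrete, here is a minimal sketch of two possible routes, assuming `lh` in the question is `lxml.html`: keep the HTML parser and query lowercased names, or switch to `lxml.etree`, whose XML parser preserves case. The `resp` string is a hypothetical, trimmed-down stand-in for the real response, and the variable names are illustrative only.

```python
import lxml.html as lh  # assumption: this is what `lh` refers to in the question
from lxml import etree

# Hypothetical, trimmed-down stand-in for `resp`, based on the XML in the question.
resp = (
    "<findItemsByProductResponse><searchResult count='1'><item>"
    "<itemId>130842622974</itemId>"
    "<globalId>EBAY-US</globalId>"
    "<sellingStatus>"
    "<convertedCurrentPrice currencyId='USD'>12.99</convertedCurrentPrice>"
    "</sellingStatus>"
    "</item></searchResult></findItemsByProductResponse>"
)

# Route 1: keep the HTML parser and write every name in lowercase.
html_doc = lh.fromstring(resp)
html_ids = html_doc.xpath('.//item/itemid/text()')
html_prices = html_doc.xpath(
    './/item/sellingstatus/convertedcurrentprice/text()')

# Route 2: parse as XML, which keeps element names exactly as written.
xml_doc = etree.fromstring(resp)
xml_ids = xml_doc.xpath('.//item/itemId/text()')
xml_prices = xml_doc.xpath(
    './/item/sellingStatus/convertedCurrentPrice/text()')

print(html_ids, html_prices)
print(xml_ids, xml_prices)
```

The same lowercasing would apply to the convertedCurrentPrice query from the P.S. if the HTML parser is kept. With the XML route, note that the XML parser is strict about well-formedness, and if the real response declares a default namespace (the excerpt does not show one), the XPath expressions would need a namespace prefix mapping.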