使用ElementTree Python3解析XML

时间:2015-01-12 23:54:13

标签: python xml

我很难解析。我整个星期都在努力,并且做得很少。 这是XML im的一个例子,它是我从亚马逊api获得的一个电话。我想收集的是ASIN,ItemCondition,FulfillmentChannel,SellerFeedbackCount,ListingPrice和Shipping。稍后我想将此信息保存到csv文件中。

<?xml version="1.0"?>
<GetLowestOfferListingsForASINResponse xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01">
<GetLowestOfferListingsForASINResult ASIN="B00CBY103E" status="Success">
  <AllOfferListingsConsidered>true</AllOfferListingsConsidered>
  <Product xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" xmlns:ns2="http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd">
    <Identifiers>
      <MarketplaceASIN>
        <MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
        <ASIN>**B00CBY103E**</ASIN>
      </MarketplaceASIN>
    </Identifiers>
    <LowestOfferListings>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>**New**</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>**Merchant**</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>0-2 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>98-100%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>4</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>**163**</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>**19.99**</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>**0.00**</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>True</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Merchant</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>0-2 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>90-94%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>5486</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>0.00</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Merchant</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>0-2 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>95-97%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>1188</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>0.00</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Merchant</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>3-7 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>95-97%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>2240</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>23.97</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>3.98</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Merchant</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>8-13 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>90-94%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>4377</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>27.99</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>19.99</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>8.00</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
    </LowestOfferListings>
  </Product>
</GetLowestOfferListingsForASINResult>
<GetLowestOfferListingsForASINResult ASIN="B00ERXG4TM" status="Success">
  <AllOfferListingsConsidered>true</AllOfferListingsConsidered>
  <Product xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" xmlns:ns2="http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd">
    <Identifiers>
      <MarketplaceASIN>
        <MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
        <ASIN>B00ERXG4TM</ASIN>
      </MarketplaceASIN>
    </Identifiers>
    <LowestOfferListings>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Amazon</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>0-2 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>95-97%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>80</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>32.95</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>32.95</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>0.00</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
      <LowestOfferListing>
        <Qualifiers>
          <ItemCondition>New</ItemCondition>
          <ItemSubcondition>New</ItemSubcondition>
          <FulfillmentChannel>Merchant</FulfillmentChannel>
          <ShipsDomestically>True</ShipsDomestically>
          <ShippingTime>
            <Max>0-2 days</Max>
          </ShippingTime>
          <SellerPositiveFeedbackRating>95-97%</SellerPositiveFeedbackRating>
        </Qualifiers>
        <NumberOfOfferListingsConsidered>1</NumberOfOfferListingsConsidered>
        <SellerFeedbackCount>80</SellerFeedbackCount>
        <Price>
          <LandedPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>32.95</Amount>
          </LandedPrice>
          <ListingPrice>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>32.95</Amount>
          </ListingPrice>
          <Shipping>
            <CurrencyCode>USD</CurrencyCode>
            <Amount>0.00</Amount>
          </Shipping>
        </Price>
        <MultipleOffersAtLowestPrice>False</MultipleOffersAtLowestPrice>
      </LowestOfferListing>
    </LowestOfferListings>
  </Product>
</GetLowestOfferListingsForASINResult>
<ResponseMetadata>
  <RequestId>362ff218-44da-4c1c-97d0-d8cb3ce81bef</RequestId>
</ResponseMetadata>
</GetLowestOfferListingsForASINResponse>

这是我目前的代码:

import csv
import xml.etree.ElementTree as ET

tree = ET.parse('outputmulti.xml')
root = tree.getroot()
for sku in root.findall('{http://mws.amazonservices.com/schema/Products/2011-10-01}GetLowestOfferListingsForASINResult/{http://mws.amazonservices.com/schema/Products/2011-10-01}Product'):
    asin = sku.find('.//{http://mws.amazonservices.com/schema/Products/2011-10-01}ASIN').text
    Condition = sku.find('.//{http://mws.amazonservices.com/schema/Products/2011-10-01}ItemCondition').text
    Fulfillment = sku.find('.//{http://mws.amazonservices.com/schema/Products/2011-10-01}FulfillmentChannel').text
    FeedBack = sku.find('.//{http://mws.amazonservices.com/schema/Products/2011-10-01}SellerFeedbackCount').text
    print(asin, Condition, Fulfillment, FeedBack)

但是,这个解析只给出了缺少几行数据的跟随:

B00CBY103E New Merchant 163
B00ERXG4TM New Amazon 80

从外观上看,我的代码只收集第一个&#34; LowestOfferListing&#34;对于每个产品/ ASIN并跳过剩余的。我需要做些什么来收集&#34; LowestOfferListing&#34;的完整列表?

另外注意,这个XML文档只是整个文档大小的一小部分。元素树是否会通过一个10倍这么大的文档来解决问题。

~~ UPDATE ~~

我越来越近了。继承我的代码和我得到的输出。

Competitors = root.findall('.//{http://mws.amazonservices.com/schema/Products/2011-10-01}LowestOfferListing')
for Competitor in Competitors:
    FeedBack = Competitor.find('{http://mws.amazonservices.com/schema/Products/2011-10-01}SellerFeedbackCount').text
    Condition = Competitor.find('{http://mws.amazonservices.com/schema/Products/2011-10-01}Qualifiers/{http://mws.amazonservices.com/schema/Products/2011-10-01}ItemCondition').text
    Fulfillment = Competitor.find('{http://mws.amazonservices.com/schema/Products/2011-10-01}Qualifiers/{http://mws.amazonservices.com/schema/Products/2011-10-01}FulfillmentChannel').text
    Price = Competitor.find('{http://mws.amazonservices.com/schema/Products/2011-10-01}Price/{http://mws.amazonservices.com/schema/Products/2011-10-01}ListingPrice/{http://mws.amazonservices.com/schema/Products/2011-10-01}Amount').text
    Shipping = Competitor.find('{http://mws.amazonservices.com/schema/Products/2011-10-01}Price/{http://mws.amazonservices.com/schema/Products/2011-10-01}Shipping/{http://mws.amazonservices.com/schema/Products/2011-10-01}Amount').text
    print(FeedBack, Condition, Fulfillment, Price, Shipping)

163 New Merchant 19.99 0.00
5486 New Merchant 19.99 0.00
1188 New Merchant 19.99 0.00
2240 New Merchant 19.99 3.98
4377 New Merchant 19.99 8.00
80 New Amazon 32.95 0.00
80 New Merchant 32.95 0.00

输出是正确的,除了数据缺少其相关Asin数的事实。我仍然无法获得Asin价值观。我假设这是因为我的for循环遍历位于ASIN下方的LowestOfferListing。我如何获得这些Asin数字? 谢谢你的帮助!!

1 个答案:

答案 0 :(得分:0)

当您执行以下操作时:

 LowestOffer.getElementsByTagName("Price")

它将返回一个数组。如果此数组的长度为零,则数据不存在。你可以这样做:

def get_data(elem, elem_name):
     results = elem.getElementsByTagName(elem_name)
     if results:
           return result[0]
     else:
           return None

 get_data(LowestOffer, "Price")