Question

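I have a multi-level XML file that I need to parse with Python, and I have xml or lxml available. How should I parse it? I couldn't find any useful solution. Please help, thanks a lot! Ideally, I'd like to parse the XML file and convert it into a pandas DataFrame. This for loop doesn't work.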

 for child in root:
     for element in child:
         for element in child:
             print(element.tag, element.attrib)

Here is part of my pretty-printed output.

<Listings>
  <Listing>
    <Location>
      <City>Amagansett</City>
      <State>NY</State>
      <Zip>11930</Zip>
      <Latitude>4.12</Latitude>
      <Longitude>2.13</Longitude>
      <DisplayAddress>No</DisplayAddress>
    </Location>
    <ListingDetails>
      <Status>For Rent</Status>
      <Price>120000</Price>
      <ListingUrl>http://www.co.com/listing.aspx?   Region=LI3&amp;ListingID=122</ListingUrl>
      <MlsId>122</MlsId>
      <DateListed>2011-06-10</DateListed>
      <NewDevelopment>N</NewDevelopment>
    </ListingDetails>
    <BasicDetails>
      <PropertyType>Other</PropertyType>
      <Description>Rental Registration #: the master suite has a lavish bath and its own terrace with small ocean views..</Description>
      <Bedrooms>5</Bedrooms>
      <Bathrooms>4</Bathrooms>
      <FullBathrooms>4</FullBathrooms>
      <HalfBathrooms>0</HalfBathrooms>
      <LivingArea>5775</LivingArea>
      <LotSize>0.8</LotSize>
    </BasicDetails>

  </Listing>
</Listings>

Answer 1

Try using xmltodict; in my opinion it is easier.

I used try and except because XML files often have differing structures, so you may get an error when trying to access something that doesn't exist.
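To illustrate the try/except idea, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for xmltodict, so it is not this answer's own code. The helper name text_or_none is hypothetical, and the second listing (Montauk) is invented purely to show a missing field.

```python
import xml.etree.ElementTree as ET

# Trimmed sample; the second <Listing> is made up to demonstrate missing elements.
XML = """<Listings>
  <Listing>
    <Location><City>Amagansett</City><State>NY</State></Location>
    <ListingDetails><Price>120000</Price></ListingDetails>
  </Listing>
  <Listing>
    <Location><City>Montauk</City></Location>
  </Listing>
</Listings>"""


def text_or_none(listing, path):
    """Return the text of the element at `path`, or None if it is absent."""
    try:
        return listing.find(path).text
    except AttributeError:  # find() returned None because the element is missing
        return None


root = ET.fromstring(XML)
rows = [
    {
        "City": text_or_none(listing, "Location/City"),
        "State": text_or_none(listing, "Location/State"),
        "Price": text_or_none(listing, "ListingDetails/Price"),
    }
    for listing in root.findall("Listing")
]
print(rows)
```

Listings that lack a field end up with None in that column instead of raising an error partway through the file.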

Answer 2

I think I figured it out! This is what I used.

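import xml.etree.ElementTree as ET

# "listings.xml" is a placeholder file name
root = ET.parse("listings.xml").getroot()

res = []
for listing in root:             # each <Listing>
    row = []
    for section in listing:      # <Location>, <ListingDetails>, <BasicDetails>
        for field in section:    # leaf elements such as <City> or <Price>
            row.append(field.text)
    res.append(row)
print(res)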
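Since the question asks for a DataFrame, one possible variation is to keep the tag names as keys instead of collecting bare values. This is only a sketch built on a trimmed copy of the sample XML from the question; the helper name listing_to_row is hypothetical, and it assumes every listing has exactly two levels below <Listing>.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question.
SAMPLE = """<Listings>
  <Listing>
    <Location>
      <City>Amagansett</City>
      <State>NY</State>
      <Zip>11930</Zip>
    </Location>
    <ListingDetails>
      <Status>For Rent</Status>
      <Price>120000</Price>
    </ListingDetails>
    <BasicDetails>
      <Bedrooms>5</Bedrooms>
      <Bathrooms>4</Bathrooms>
    </BasicDetails>
  </Listing>
</Listings>"""


def listing_to_row(listing):
    """Flatten one <Listing> into a dict mapping leaf tag -> text."""
    row = {}
    for section in listing:      # <Location>, <ListingDetails>, <BasicDetails>
        for field in section:    # leaf elements such as <City>
            row[field.tag] = field.text
    return row


root = ET.fromstring(SAMPLE)
rows = [listing_to_row(listing) for listing in root.findall("Listing")]
print(rows)
```

If pandas is installed, passing this list of dicts to pandas.DataFrame(rows) should give one row per listing and one column per tag. All values are still strings at that point, so numeric columns such as Price would need converting afterwards.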

How to parse a multi-level XML file with Python
