How to parse XML and store it as a list (Python)

Time: 2018-07-23 12:51:37

Tags: python xml excel elementtree

Sorry, I have to ask again.

I want to convert an XML file to Excel using xml.etree.ElementTree.

Suppose my XML looks like this:

<ParameterCluster>
          <Name>AAAAAA</Name>
          <ParameterDefinitionList>
            <ParameterDefinition>
              <Name>LengthMin</Name>
              <Type>UInt8</Type>
            </ParameterDefinition>
            <ParameterDefinition>
              <Name>LengthMax</Name>
              <Type>UInt8</Type>
            </ParameterDefinition>
          </ParameterDefinitionList>

          <VariantImlementationList>
            <VariantImlementation>
              <MajorVariantList>
                <MajorVariant>A_Basis</MajorVariant>
              </MajorVariantList>
              <MinorVariantList>
                        <ParameterValue>
                          <ValueList>
                            <Value>47</Value>
                          </ValueList>
                          <ValueList>
                            <Value>80</Value>
                          </ValueList>
                        </ParameterValue>
              </MinorVariantList>
              <MajorVariantList>
                <MajorVariant>B_Basis</MajorVariant>
                <MajorVariant>C_Basis</MajorVariant>
              </MajorVariantList>
              <MinorVariantList>
                        <ParameterValue>
                          <ValueList>
                            <Value>47</Value>
                          </ValueList>
                          <ValueList>
                            <Value>40</Value>
                          </ValueList>
                        </ParameterValue>
              </MinorVariantList> 
            </VariantImlementation>
          </VariantImlementationList>
        </ParameterCluster>

This means I have 3 bases (A_Basis, B_Basis, C_Basis).

In A_Basis, LengthMin is 47 and LengthMax is 80.

But in B_Basis and C_Basis, LengthMin is 47 and LengthMax is 40.

So I would like to get something like this:

{'AAAAAA','LengthMin','UInt8','A_Basis',47}
{'AAAAAA','LengthMax','UInt8','A_Basis',80}
{'AAAAAA','LengthMin','UInt8','B_Basis',47}
{'AAAAAA','LengthMax','UInt8','B_Basis',40}
{'AAAAAA','LengthMin','UInt8','C_Basis',47}
{'AAAAAA','LengthMax','UInt8','C_Basis',40}

Then I can write it to an Excel file. Is it possible to get a list like this?

1 answer:

Answer 0: (score: 1)

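For parsing the XML, you can use BeautifulSoup instead of xml.etree.ElementTree (its interface is more intuitive).

The parsing is straightforward, assuming the number of ParameterDefinition entries always matches the number of ValueList entries in each ParameterValue: first extract the parameter names and types, then iterate over every <MajorVariant> and fill the result list.

If BeautifulSoup is acceptable, here is sample code: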

data = """<ParameterCluster>
              <Name>AAAAAA</Name>
              <ParameterDefinitionList>
                <ParameterDefinition>
                  <Name>LengthMin</Name>
                  <Type>UInt8</Type>
                </ParameterDefinition>
                <ParameterDefinition>
                  <Name>LengthMax</Name>
                  <Type>UInt8</Type>
                </ParameterDefinition>
              </ParameterDefinitionList>

              <VariantImlementationList>
                <VariantImlementation>
                  <MajorVariantList>
                    <MajorVariant>A_Basis</MajorVariant>
                  </MajorVariantList>
                  <MinorVariantList>
                            <ParameterValue>
                              <ValueList>
                                <Value>47</Value>
                              </ValueList>
                              <ValueList>
                                <Value>80</Value>
                              </ValueList>
                            </ParameterValue>
                  </MinorVariantList>
                  <MajorVariantList>
                    <MajorVariant>B_Basis</MajorVariant>
                    <MajorVariant>C_Basis</MajorVariant>
                  </MajorVariantList>
                  <MinorVariantList>
                            <ParameterValue>
                              <ValueList>
                                <Value>47</Value>
                              </ValueList>
                              <ValueList>
                                <Value>40</Value>
                              </ValueList>
                            </ParameterValue>
                  </MinorVariantList>
                </VariantImlementation>
              </VariantImlementationList>
            </ParameterCluster>"""


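from bs4 import BeautifulSoup
from pprint import pprint

# The 'xml' feature needs the lxml package to be installed.
soup = BeautifulSoup(data, 'xml')

# Build one [cluster name, parameter name, parameter type] prefix per definition.
name, types = soup.select_one('Name'), []
for n, t in zip(soup.select('ParameterDefinitionList Name'), soup.select('ParameterDefinitionList Type')):
    types.append([name.text, n.text, t.text])

# Pair each MajorVariantList with the MinorVariantList that follows it (document order).
rv = []
for major, minor in zip(soup.select('MajorVariantList'), soup.select('MajorVariantList ~ MinorVariantList')):
    for mj in major.select('MajorVariant'):
        # The i-th Value belongs to the i-th ParameterDefinition.
        for i, mn in enumerate(minor.select('Value')):
            rv.append(types[i] + [mj.text, mn.text])

pprint(rv, width=120)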

Output:

[['AAAAAA', 'LengthMin', 'UInt8', 'A_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'A_Basis', '80'],
 ['AAAAAA', 'LengthMin', 'UInt8', 'B_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'B_Basis', '40'],
 ['AAAAAA', 'LengthMin', 'UInt8', 'C_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'C_Basis', '40']]
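
Since the question asks for xml.etree.ElementTree, the same pairing logic can probably be done with the standard library alone. The following is only a sketch under the same assumptions as above: each MajorVariantList is followed by its MinorVariantList, and the i-th Value belongs to the i-th ParameterDefinition. The names XML, parse_rows, rows and csv_text are made up for illustration, the XML is a shortened copy of the one in the question, and CSV is used as a stand-in for an Excel file because Excel can open it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: same layout).
XML = """<ParameterCluster>
  <Name>AAAAAA</Name>
  <ParameterDefinitionList>
    <ParameterDefinition><Name>LengthMin</Name><Type>UInt8</Type></ParameterDefinition>
    <ParameterDefinition><Name>LengthMax</Name><Type>UInt8</Type></ParameterDefinition>
  </ParameterDefinitionList>
  <VariantImlementationList>
    <VariantImlementation>
      <MajorVariantList><MajorVariant>A_Basis</MajorVariant></MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>80</Value></ValueList>
      </ParameterValue></MinorVariantList>
      <MajorVariantList>
        <MajorVariant>B_Basis</MajorVariant>
        <MajorVariant>C_Basis</MajorVariant>
      </MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>40</Value></ValueList>
      </ParameterValue></MinorVariantList>
    </VariantImlementation>
  </VariantImlementationList>
</ParameterCluster>"""


def parse_rows(xml_text):
    """Return one [cluster, parameter, type, variant, value] row per combination."""
    root = ET.fromstring(xml_text)
    cluster = root.findtext('Name')  # direct child <Name> of the root element
    definitions = [
        (d.findtext('Name'), d.findtext('Type'))
        for d in root.iter('ParameterDefinition')
    ]
    rows = []
    for impl in root.iter('VariantImlementation'):
        # Assumption: the k-th MajorVariantList goes with the k-th MinorVariantList.
        majors = impl.findall('MajorVariantList')
        minors = impl.findall('MinorVariantList')
        for major, minor in zip(majors, minors):
            values = [v.text for v in minor.iter('Value')]
            for variant in major.findall('MajorVariant'):
                # Assumption: the i-th Value belongs to the i-th definition.
                for (pname, ptype), value in zip(definitions, values):
                    rows.append([cluster, pname, ptype, variant.text, int(value)])
    return rows


rows = parse_rows(XML)

# Write the rows as CSV text; an in-memory buffer keeps the sketch self-contained.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file, the buffer could be replaced with open('out.csv', 'w', newline=''), where out.csv is a hypothetical file name. Note that zip silently stops at the shorter sequence, so a mismatch between definitions and values would drop rows rather than raise an error.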