Parse from the root node to get the whole file structure?

Asked: 2019-07-03 12:54:53

Tags: python xml tree xml-parsing

My Python script reads an XML file that describes a folder structure.

My XML file:

<?xml version="1.0" encoding="utf-8"?>
<folderstructure>
  <folder name="Fail">
    <folder name="Cam 1">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
    <folder name="Cam 2">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
  </folder>
  <folder name="Pass">
    <folder name="Cam 1">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
    <folder name="Cam 2">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
  </folder>
</folderstructure>

Based on Fetching the path( from root node ) for all the leaf nodes (my previous question), I wrote the following script:

def walk(e, runningPath='', flag = 1):
    name = e.attrib['name']

    if len(e)>0:
        runningPath += '/' + name
    children = [walk(c, runningPath, 0) for c in e if ((e.tag == 'folderstructure' and flag==1) or (e.tag=='folder' and flag == 0))]
    print(children)
    return {'name': name, 'children': children} if children else {'name': name, 'path': runningPath + '/' + name}

But the script above produces 'None' as output.

The output I want is:

{'children': [{'children': [{'children': [{'children': [{'name': '2019-04-09',
                                                         'path': '/Fail/Cam '
                                                                 '1/Mod1/2019-04-09'}],
                                           'name': 'Mod1'}],
                             'name': 'Cam 1'},
                            {'children': [{'children': [{'name': '2019-04-09',
                                                         'path': '/Fail/Cam '
                                                                 '2/Mod1/2019-04-09'}],
                                           'name': 'Mod1'}],
                             'name': 'Cam 2'}],
               'name': 'Fail'},
              {'children': [{'children': [{'children': [{'name': '2019-04-09',
                                                         'path': '/Pass/Cam '
                                                                 '1/Mod1/2019-04-09'}],
                                           'name': 'Mod1'}],
                             'name': 'Cam 1'},
                            {'children': [{'children': [{'name': '2019-04-09',
                                                         'path': '/Pass/Cam '
                                                                 '2/Mod1/2019-04-09'}],
                                           'name': 'Mod1'}],
                             'name': 'Cam 2'}],
               'name': 'Pass'}]
}

How can I solve this?

1 answer:

Answer 0 (score: 1)

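Your function returns None when an exception is raised and swallowed. A try: except block catches every exception, so it hides the real cause of the problem. Remove that block from your code to see the actual error, or catch a more specific exception. I also noticed that 'folderstructure' has no name attribute. You can fix this either by adding <folderstructure name='some name'> to the XML or by setting a default name for the root element. The code below seems to work: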

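def walk(e, runningPath='', flag=1):
    # The root <folderstructure> element has no 'name' attribute,
    # so fall back to a default instead of failing with KeyError.
    # Note: with this default, every path starts with '/root'.
    try:
        name = e.attrib['name']
    except KeyError:
        name = 'root'

    # Only elements that have children extend the running path.
    if len(e) > 0:
        runningPath += '/' + name
    # flag == 1 marks the initial call on the root; recursive calls pass 0.
    children = [walk(c, runningPath, 0) for c in e if ((e.tag == 'folderstructure' and flag == 1) or (e.tag == 'folder' and flag == 0))]
    print(children)
    return {'name': name, 'children': children} if children else {'name': name, 'path': runningPath + '/' + name}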
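
The requested output has no 'name' key on the top-level dict and no root prefix in the paths. One way to get that shape is sketched below. This is an illustrative variant, not the code above: the function name walk_tree is hypothetical, it uses Element.get() in place of the try/except, and the embedded XML is a shortened copy of the file from the question with the XML declaration left out.

```python
import xml.etree.ElementTree as ET
from pprint import pprint

# Shortened copy of the question's XML (assumption: same layout, fewer folders).
XML = """<folderstructure>
  <folder name="Fail">
    <folder name="Cam 1">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
  </folder>
  <folder name="Pass">
    <folder name="Cam 2">
      <folder name="Mod1">
        <folder name="2019-04-09" />
      </folder>
    </folder>
  </folder>
</folderstructure>"""


def walk_tree(e, running_path=''):
    """Hypothetical helper: build a nested dict of folders with leaf paths."""
    # .get() returns None instead of raising KeyError when 'name' is missing,
    # which is the case for the root <folderstructure> element.
    name = e.get('name')
    # Extend the path only for named elements, so the root adds no prefix.
    path = running_path + '/' + name if name is not None else running_path
    # Recurse into <folder> children only; other tags are ignored.
    children = [walk_tree(c, path) for c in e if c.tag == 'folder']
    node = {'children': children} if children else {'path': path}
    if name is not None:
        node['name'] = name
    return node


tree = walk_tree(ET.fromstring(XML))
pprint(tree)
```

Because the root has no name, it contributes nothing to the path and its dict contains only 'children', which is the shape asked for in the question.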