Question

I have an XML file that lists jobs that need to be run, and I would like to parse it with Python. Here is my sample XML file.

XML code:

<?xml version="1.0" encoding="ISO-8859-1"?>

<Jobs>

<Job name="Leo" type="upload">
    <File name="Leo.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <File name="Leo2.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <Log name="Leo.txt" path="/leegin/leo/OU/log"/>
    <Notify name="Leo Cruz" email="lcruz@me.com"/>
    <ftp port="21" proto="0" pasvmode="0" mode="0"/>
</Job>

<Job name="Manny" type="download">
    <File name="Manny.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <File name="Manny2.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <Log name="Manny.txt" path="/leegin/leo/OU/log"/>
    <Notify name="Manny Caparas" email="mcaparas@me.com"/>
    <ftp port="21" proto="0" pasvmode="0" mode="0"/>
</Job>

<Job name="Joe" type="copy">
    <File name="Joe.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <File name="Joe2.csv" source="/leegin/leo/OU" destination="/leegin/leo/OU/scripts" archive="/leegin/leo/OU/history" date="1" del="1" stat="1"/>
    <Log name="Joe.txt" path="/leegin/leo/OU/log"/>
    <Notify name="Joe Gomez" email="jgomez@me.com"/>
    <ftp port="21" proto="0" pasvmode="0" mode="0"/>
</Job>

</Jobs>

Python code:

#!/usr/bin/python2.6

import sys
import optparse

def main():
    desc="""This script is used to setup and run an Automator job."""
    parser = optparse.OptionParser()
    parser.description = desc
    parser.add_option('-j', dest='jobname', type='str', action='store', help='Name of job to execute', metavar='[JobName]')
    parser.add_option('-v', dest='verbose', action='store_true', default=False, help='Used to view scripts debug information.')
    (options, args) = parser.parse_args()

    mandatory_options = ['jobname']
    for m in mandatory_options:
        if not options.__dict__[m]:
            print 'Options -j is required.'
            parser.print_help()
            sys.exit(-1)

    getjob(options.jobname)

def getjob(task):
    from xml.etree import ElementTree
    from xml.etree.ElementTree import Element
    from xml.etree.ElementTree import SubElement

    doc = ElementTree.parse('/opt/automize/template/jobs.xml')

    Files = doc.findall("./Job/File")
    for File in Files:
        print File.attrib['name']

if __name__ == '__main__':  
    main()

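What I want to do is pass a job name to the Python script, have the script find that job in the XML file, and extract only the parts that belong to that specific job.

So far I have been able to build a list of all jobs or of all files, but I have not been able to do this for one particular job. I would really appreciate some guidance.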

Answer 1

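The findall method you are using takes a pattern argument, which, according to the documentation:

Can be either a tag name or a path expression. If a tag name is given, only direct subelements are checked. Path expressions can be used to search the entire subtree.

If you read the documentation on path expressions, you will see that they are a subset of XPath. So you just need to know the right way to specify your query in XPath terms (or, more precisely, in the subset of XPath that etree supports).

Your query asks for all File nodes under all Job nodes. To ask for all File nodes under only those Job nodes that have the attribute name='Manny', use Job[@name='Manny'] in place of Job.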

So:

doc.findall("./Job[@name='{}']/File".format(task))
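As a rough illustration of that query, here is a minimal, self-contained sketch. It assumes an ElementTree version with attribute predicates (Python 2.7 or 3.x), uses a trimmed-down inline copy of the question's XML instead of /opt/automize/template/jobs.xml, and introduces a hypothetical helper named file_names_for_job that is not part of the original script.

```python
from xml.etree import ElementTree

# Trimmed-down copy of the jobs file from the question (two jobs, fewer attributes).
SAMPLE_XML = """<Jobs>
<Job name="Leo" type="upload">
    <File name="Leo.csv"/>
    <File name="Leo2.csv"/>
    <Log name="Leo.txt" path="/leegin/leo/OU/log"/>
</Job>
<Job name="Manny" type="download">
    <File name="Manny.csv"/>
    <File name="Manny2.csv"/>
    <Log name="Manny.txt" path="/leegin/leo/OU/log"/>
</Job>
</Jobs>"""


def file_names_for_job(root, task):
    """Return the name attribute of every File under the Job called task.

    Hypothetical helper for illustration; not part of the original script.
    """
    # The predicate [@name='...'] restricts the match to a single Job.
    pattern = "./Job[@name='{0}']/File".format(task)
    return [f.attrib['name'] for f in root.findall(pattern)]


root = ElementTree.fromstring(SAMPLE_XML)
print(file_names_for_job(root, 'Manny'))  # ['Manny.csv', 'Manny2.csv']
```

Note that the job name is pasted into the pattern as-is, so a name containing a single quote would produce an invalid path expression.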

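Unfortunately, the XPath support in etree 1.2 is a lot less complete than in 1.3, and I believe Python 2.6 ships with 1.2 built in, so this may not work for you. (If that is the case, I believe it will be immediately obvious: the path pattern compiler will raise an exception telling you that you are using a separator or operator it has never heard of, rather than, say, appearing to work but not actually matching anything.)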

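The obvious solutions are:

Use Python 2.7 (or 3.x) instead of 2.6.
Install ElementTree 1.3 and use it instead of the built-in implementation.
Download ElementTree 1.3, copy its ElementTree.py and ElementPath.py files into your project, and import them.
Install lxml and use its implementation instead of the reference implementation.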
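If none of those options is available, one further possibility (not something suggested in the answer above) is to skip the attribute predicate and compare the name attribute in plain Python, which only needs tag-name matching. The sketch below assumes a trimmed-down inline copy of the question's XML; find_job is a hypothetical helper name.

```python
from xml.etree import ElementTree

# Trimmed-down copy of the jobs file from the question (two jobs, fewer attributes).
SAMPLE_XML = """<Jobs>
<Job name="Leo" type="upload">
    <File name="Leo.csv"/>
    <Log name="Leo.txt" path="/leegin/leo/OU/log"/>
</Job>
<Job name="Manny" type="download">
    <File name="Manny.csv"/>
    <File name="Manny2.csv"/>
    <Log name="Manny.txt" path="/leegin/leo/OU/log"/>
</Job>
</Jobs>"""


def find_job(root, task):
    """Return the first Job element whose name attribute equals task, or None.

    Hypothetical helper for illustration; it filters in Python rather than
    relying on XPath attribute predicates.
    """
    for job in root.findall('Job'):  # plain tag name: direct children only
        if job.get('name') == task:
            return job
    return None


root = ElementTree.fromstring(SAMPLE_XML)
job = find_job(root, 'Manny')
if job is not None:
    # Walk every child of the matching Job: File, Log, and so on.
    for child in job:
        print(child.tag, child.attrib.get('name'))
```

Returning the Job element itself, rather than only its File children, keeps the Log, Notify, and ftp settings of that job within reach.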
