Question

I have an XML file named junit.xml in the same directory as my script, and I can parse it like this:

import os
import xml.etree.ElementTree as ET

xml_file = os.path.abspath(__file__)
xml_file = os.path.dirname(xml_file)
xml_file = os.path.join(xml_file, "junit.xml")
root = ET.parse(xml_file).getroot()  # ET is the ElementTree module

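That all works fine.

However, I have a more complicated case where I need to parse a bunch of files that all share the name "junit.xml" but live in a series of different directories.

The directories are as follows: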

\myhome\ireland\modules\builds\date1
\myhome\ireland\modules\builds\date2
\myhome\england\modules\builds\date1
\myhome\england\modules\builds\date2
\myhome\scotland\modules\builds\date1
\myhome\scotland\modules\builds\date2
\myhome\wales\modules\builds\date1
\myhome\wales\modules\builds\date2
\myhome\germany\modules\builds\date1
\myhome\germany\modules\builds\date2

Each of these directories contains a collection of XML files. I only want to get the files named junit.xml in:

\myhome\ireland\modules\builds\date2
\myhome\england\modules\builds\date2
\myhome\scotland\modules\builds\date2

How can I do this in a Pythonic way, so that I can vary the country names and the dates I need?

Answer 1

Use a string template for the path, for example:

path = r"\myhome\{}\modules\builds\date{}\junit.xml"

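You can build the actual path later with str.format() (for example, path.format("ireland", 1)).

Then you can loop over the country names and dates, and parse the XML file for each combination: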

for country in ["ireland", "england", "scotland"]:
    for num in [1, 2]:
        parse_xml(path.format(country, num))

Here parse_xml is a function you define that takes the path to an XML file and parses it.
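
The answer leaves parse_xml up to you. A minimal sketch of what it might look like, assuming you want to skip files that are missing or malformed instead of stopping the whole loop (the return-None convention is an assumption, not part of the answer):

```python
import xml.etree.ElementTree as ET


def parse_xml(xml_path):
    """Parse one XML file and return its root element, or None if it cannot be read."""
    try:
        return ET.parse(xml_path).getroot()
    except (OSError, ET.ParseError):
        # Missing, unreadable or malformed file: report nothing and let the caller move on.
        return None
```

The caller can then test the result with `if root is not None:` before using it.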

Answer 2

First, define the "template" that your file paths follow, then a list of countries and a list of dates:

dir_template = r'\myhome\%(country)s\modules\builds\%(date)s\junit.xml'
countries = ['ireland', 'england', 'scotland', 'wales', 'germany']
dates = ['date1', 'date2']

for c in countries:
    for d in dates:
        xml_file = dir_template % {'country': c, 'date': d}
        root = ET.parse(xml_file).getroot()
        # ...
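
The same idea can be wrapped in a small generator so the countries and dates become parameters. This is only a sketch: junit_paths and its arguments are hypothetical names, and it uses os.path.join instead of hard-coded backslashes so the separator matches the platform it runs on.

```python
import itertools
import os


def junit_paths(base, countries, dates, filename="junit.xml"):
    """Yield (country, date, path) for every country/date combination."""
    for country, date in itertools.product(countries, dates):
        path = os.path.join(base, country, "modules", "builds", date, filename)
        yield country, date, path


# Example: only the date2 builds for three countries.
for country, date, path in junit_paths(r"\myhome", ["ireland", "england", "scotland"], ["date2"]):
    print(country, date, path)
```

Each yielded path can be passed to ET.parse() exactly as in the loop above.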

Answer 3

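import os

countries = ['england', 'wales', 'germany', 'etc']
countrypath = r'\myhome\{}\modules\builds'  # raw string: '\b' would otherwise be a backspace
filename = 'junit.xml'
for country in countries:
    path = countrypath.format(country)
    for item in os.listdir(path):  # list the formatted path, not the template
        # isdir needs the full path, not just the entry name
        if os.path.isdir(os.path.join(path, item)) and item.startswith('date'):
            xml_file = os.path.join(path, item, filename)
            # ... parse xml_file here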

Answer 4

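It is not as efficient as setting up a list of candidate directories up front, but you can also find the junit.xml files recursively with os.walk, like this: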

import os

def get_junit_filenames(directory):
    for dirpath, dirnames, filenames in os.walk(directory):
        if 'junit.xml' in filenames:
            yield os.path.join(dirpath, 'junit.xml')

for filename in get_junit_filenames('/myhome'):
    print(filename)  # replace with your own processing of the file

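This way you do not have to worry about directories being added to or removed from the file system, because the junit.xml files are found regardless of such changes.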
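
If you only want certain countries or dates, a glob pattern can narrow the search without walking the whole tree. The sketch below uses the standard glob module instead of os.walk. The name find_junit_files and its parameters are hypothetical, and it assumes the layout from the question: <base>/<country>/modules/builds/<date>/junit.xml.

```python
import glob
import os


def find_junit_files(base, country="*", date="*"):
    """Return the sorted junit.xml paths matching the given country and date.

    The default "*" matches any directory name at that level.
    """
    pattern = os.path.join(base, country, "modules", "builds", date, "junit.xml")
    return sorted(glob.glob(pattern))
```

For example, find_junit_files("/myhome", date="date2") would return the date2 file for every country that has one.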

Answer 5

    date = "dateX"
    countries = ["ireland", "wales", "england"]

    for country in countries:
        # raw string so that "\b" is not read as a backspace escape
        path = r"\myhome\%(country)s\modules\builds\%(date)s\junit.xml" \
            % {"country": country, "date": date}
        # check to see if the file you want is there?
        if os.path.exists(path):
            root = ET.parse(path).getroot()

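The "os" module has a function called "walk" that lets you traverse an entire directory subtree. You may want to look at it if you want to discover all files called junit.xml and process them.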

Reading the same file name from multiple locations
