Opening the files in a folder one by one

Time: 2017-07-23 12:27:37

Tags: python xml

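I'm new to programming in Python, but I've been given the task of writing a script that writes down all IDs whose element has type = 0 or type = 1. The input is an XML file that looks like this example: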

<root>
<bla1 type="0" id = "1001" pvalue:="djdjd"/>
<bla2 type="0" id = "1002" pvalue:="djdjd" />
<bla3 type="0" id = "1003" pvalue:="djdjd"/>
<bla4 type="0" id = "1004" pvalue:="djdjd"/>
<bla5 type="0" id = "1005" pvalue:="djdjd"/>
<bla6 type="1" id = "1006" pvalue:="djdjd"/>
<bla7 type="0" id = "1007" pvalue:="djdjd"/>
<bla8 type="0" id = "1008" pvalue:="djdjd"/>
<bla9 type="1" id = "1009" pvalue:="djdjd"/>
<bla10 type="0" id = "1010" pvalue:="djdjd"/>
<bla11 type="0" id = "1011" pvalue:="djdjd"/>
<bla12 type="0" id = "1009" pvalue:="djdjd"/>

</root>

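So the first thing the code does is replace ':=' with '=', because the ':=' causes an error when my XML is loaded. After that, it writes down the IDs with type 0 and the IDs with type 1. This works for a single XML file. Unfortunately, I have more than one file, so I need something like a loop that always opens the next XML file in the folder (the file names differ) and appends the newly found IDs to the IDs collected from the previous XML files. So basically, each new XML file adds its IDs to the running lists.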

    import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

    xml_file = 'ID3.xml'  # name of the XML file; it must be in the same folder as the .py file

    with open(xml_file, 'r') as my_file:  # open the XML file
        xml_text = my_file.read()  # read the whole file into one string

    # ':=' is not valid XML attribute syntax, so replace it with '=' before parsing
    xml_text = xml_text.replace(':=', '=')

    root = ET.fromstring(xml_text)  # parse the repaired text

    id_0 = []  # IDs of elements with type="0"
    id_1 = []  # IDs of elements with type="1"
    id_one2zero = []  # IDs that occur with both types

    for element in root:
        element_type = element.get('type')  # read the type attribute
        if element_type == '0':
            id_0.append(element.get('id'))
        elif element_type == '1':
            id_1.append(element.get('id'))
        else:
            # the element has neither type="0" nor type="1"
            print("Unknown type occurred")

    for found_id in id_0:  # check for IDs that occur with both types
        if found_id in id_1:
            id_one2zero.append(found_id)

    print(id_0)
    print(id_1)
    with open('write.xml', 'w') as f:
        f.write('whatever\n')
    print('<end>')

1 Answer:

Answer 0 (score: 0)

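A simple way to solve this is the os.walk() function. It lets you iterate over every file in a directory, including the files in its subdirectories, because it walks the directory tree recursively.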

Here is an example of how to use it:

import os

for root, dirs, files in os.walk("your/path"):
    for file in files:
        path = os.path.join(root, file)  # full path of the current file
        # process the file at `path` here
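
To make the three loop variables concrete, the sketch below builds a small throwaway directory tree and lists what os.walk() finds in it. The helper name list_files and the file names (a.xml, notes.txt, sub/b.xml) are made up for illustration. The point it shows is that each entry in files is only a name, so it has to be joined with root before it can be passed to open().

```python
import os
import tempfile

def list_files(top):
    """Return the full path of every file below `top`, including subfolders."""
    paths = []
    for root, dirs, files in os.walk(top):
        for file in files:
            # `file` is only a name; join it with `root` to get an openable path
            paths.append(os.path.join(root, file))
    return sorted(paths)

# Hypothetical demo tree: top/a.xml, top/notes.txt, top/sub/b.xml
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for relative in ("a.xml", "notes.txt", os.path.join("sub", "b.xml")):
        with open(os.path.join(top, relative), "w") as handle:
            handle.write("<root/>")
    # Show the paths relative to the demo folder so they are easy to read
    found = [os.path.relpath(p, top) for p in list_files(top)]
    print(found)
```

The file in the subfolder shows up as well, which is what "recursively" means here.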

If your directory contains files other than XML files, you can make sure that only XML files are processed by checking file.endswith(".xml").
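
Putting the pieces together, a minimal sketch of the loop the question asks for might look like the following. It assumes every file has the same layout as the example in the question (child elements of the root carrying type and id attributes, with ':=' needing repair before parsing). The function names ids_from_text and collect_ids are placeholders chosen for this sketch, not part of any library.

```python
import os
import xml.etree.ElementTree as ET

def ids_from_text(xml_text):
    """Return (type-0 ids, type-1 ids) found in one XML document given as text."""
    # Repair the invalid ':=' before handing the text to the parser
    root = ET.fromstring(xml_text.replace(":=", "="))
    id_0, id_1 = [], []
    for element in root:
        if element.get("type") == "0":
            id_0.append(element.get("id"))
        elif element.get("type") == "1":
            id_1.append(element.get("id"))
    return id_0, id_1

def collect_ids(folder):
    """Walk `folder` and accumulate the ids from every .xml file below it."""
    all_0, all_1 = [], []
    for root, dirs, files in os.walk(folder):
        for file in sorted(files):
            if not file.endswith(".xml"):
                continue  # skip anything that is not an XML file
            with open(os.path.join(root, file), "r") as handle:
                id_0, id_1 = ids_from_text(handle.read())
            all_0.extend(id_0)  # append the new ids to those already found
            all_1.extend(id_1)
    return all_0, all_1
```

Calling collect_ids("your/path") returns the two accumulated lists, which can then be printed or checked for IDs that appear under both types, as in the question's script. Keeping the per-file work in its own function means the loop only has to deal with finding files and merging results.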