问题

Question

我有一个包含5个数据部分的.txt文件。每个部分都有一个标题行“Section X”。我想从这个单独的文件中解析并编写5个单独的文件。该部分将从标题开始，并在下一个标题标题之前结束。下面的代码创建5个单独的文件;但是，它们都是空白的。

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2",
    "Section 3", "Section 4", "Section 5"]

with open(filename+".txt", "rb") as oldfile:
    for i in dimensionsList:
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        with open(i+".txt", "w") as newfile: 
            for line in oldfile:
                if line.strip() == i:
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    break
                newfile.write(line)

Answer 1

问题

测试你的代码，它只适用于第1部分（其他人对我来说也是空白的）。我意识到问题是Sections之间的转换（以及licycle在所有迭代中重新启动）。

第二节是在第二个for（if line.strip() == nextelem）阅读的。下一行是第2节的数据（而不是文本Section 2）。

语言很难，但请测试下面的代码：

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:
    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    for i in dimensionsList:
        print(nextelem)
        with open(i + ".txt", "w") as newfile:
            for line in oldfile:
                print("ignoring %s" % (line.strip()))
                if line.strip() == i:
                    nextelem = licycle.next()
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    # nextelem = licycle.next()
                    print("ignoring %s" % (line.strip()))
                    break
                print("printing %s" % (line.strip()))
                newfile.write(line)
            print('')

它将打印：

Section 1
ignoring Section 1
printing aaaa
printing bbbb
ignoring Section 2

Section 2
ignoring ccc
ignoring ddd
ignoring Section 3
ignoring eee
ignoring fff
ignoring Section 4
ignoring ggg
ignoring hhh
ignoring Section 5
ignoring iii
ignoring jjj

Section 2

Section 2

Section 2

它适用于第1部分，它检测第2部分，但它一直忽略这些行，因为它找不到＆＃34;第2节＆＃34;。

如果每次重新启动行（总是从第1行开始），我认为该程序可以正常工作。但我做了一个更简单的代码，应该适合你。

解决方案

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:

    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    newfile = None
    line = oldfile.readline()

    while line:

        # Case 1: Found new section
        if line.strip() == nextelem:
            if newfile is not None:
                newfile.close()
            nextelem = licycle.next()
            newfile = open(line.strip() + '.txt', 'w')

        # Case 2: Print line to current section
        elif newfile is not None:
            newfile.write(line)

        line = oldfile.readline()

如果找到Section，它就会开始写这个新文件。否则，继续写入当前文件。

Ps。：下面是我使用的示例文件：

Section 1
aaaa
bbbb
Section 2
ccc
ddd
Section 3
eee
fff
Section 4
ggg
hhh
Section 5
iii
jjj

Python - 为单个文件的每个部分编写单独的文件

1 个答案:

问题

解决方案