Question

Splitting data line by line with Python

Regex?
A containment check?

As an example, the file "file" contains:

X
X
Y
Z
Z
Z

I need a clean way to split this file into 3 different files based on the letter.

As an example:

def split_by_platform(FILE_NAME):

    with open(FILE_NAME, "r") as infile:
        Data = infile.read()
        # Pseudocode for the desired behavior:
        # if a line contains "X": write it to x.txt
        # if a line contains "Y": write it to y.txt
        # if a line contains "Z": write it to z.txt

The x.txt file would look like this:

X
X

The y.txt file would look like this:

Y

The z.txt file would look like this:

Z
Z
Z

Answer 1

Thanks to @bruno desthuilliers, who reminded me of the right way to go here:

Iterate over the file object itself (not "readlines"):

def split_by_platform(FILE_NAME, out1, out2, out3):

    with open(FILE_NAME, "r") as infile, open(out1, 'a') as of1, open(out2, 'a') as of2, open(out3, 'a') as of3:
        for line in infile:
            if "X" in line:
                of1.write(line)
            elif "Y" in line:
                of2.write(line)
            elif "Z" in line:
                of3.write(line)

Edit, following a hint from @dim: here is a more general approach for an arbitrary-length list of flag characters:

def loop(infilename, flag_chars):
    with open(infilename, 'r') as infile:
        for line in infile:
            for c in flag_chars:
                if c in line:
                    with open(c+'.txt', 'a') as outfile:
                        outfile.write(line)
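
The loop above reopens an output file for every matching line. A possible variation, shown here only as a sketch, opens each output file once with contextlib.ExitStack from the standard library. The function name split_by_flags is made up for this illustration. One behavioral difference to be aware of: this version creates an (empty) output file even for flags that never match.

```python
from contextlib import ExitStack


def split_by_flags(infilename, flag_chars):
    """Append each line to '<flag>.txt' for every flag character it contains.

    Hypothetical helper, not part of the original answer.
    """
    with ExitStack() as stack:
        infile = stack.enter_context(open(infilename, "r"))
        # Open one output file per flag up front; ExitStack closes them all
        # when the with block ends, even if an error occurs.
        outfiles = {
            c: stack.enter_context(open(c + ".txt", "a"))
            for c in flag_chars
        }
        for line in infile:
            for c, outfile in outfiles.items():
                if c in line:
                    outfile.write(line)
```

Calling split_by_flags("file", ["X", "Y", "Z"]) on the sample file from the question should fill X.txt, Y.txt and Z.txt in the current directory.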

Answer 2

This should do it:

with open('my_text_file.txt') as infile, open('x.txt', 'w') as x, open('y.txt', 'w') as y, open('z.txt', 'w') as z:
    for line in infile:
        if line.startswith('X'):
            x.write(line)
        elif line.startswith('Y'):
            y.write(line)
        elif line.startswith('Z'):
            z.write(line)
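
Since the question also asks about regular expressions, here is a small sketch of how the same "starts with" test could be written with the re module. The pattern and the helper name flag_of are assumptions made for this illustration. For single fixed letters, startswith is probably the simpler choice; a regex mainly helps once the markers become more complex.

```python
import re

# Hypothetical pattern: the line must begin with X, Y or Z.
LEADING_FLAG = re.compile(r"[XYZ]")


def flag_of(line):
    """Return the leading flag letter of a line, or None if there is none."""
    # re.match only matches at the start of the string, like startswith.
    match = LEADING_FLAG.match(line)
    return match.group(0) if match else None
```

Note the difference from a containment check: for a line such as "aX", flag_of returns None, while "X" in "aX" is True.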

Answer 3

Here is a more general way to do the same job:

from collections import Counter

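with open("file.txt", "r") as file:
    # Read all lines without their trailing newlines
    data = file.read().splitlines()
    # Count how often each distinct line occurs
    counter = Counter(data)
    # One list per distinct line, repeated as often as it occurred
    array2d = [[key] * value for key, value in counter.items()]
    print(array2d)  # e.g. [['X', 'X'], ['Y'], ['Z', 'Z', 'Z']]
    for el in array2d:
        # The file name is taken from the line's content, e.g. "X.txt"
        with open(str(el[0]) + ".txt", "w") as f:
            for e in el:
                f.write(e + "\n")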

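The code above generates X.txt, Y.txt and Z.txt with the corresponding values. If, for example, the file also contained several C lines, the code would generate a C.txt file as well.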
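
To make the intermediate values easier to follow, here is the grouping step on its own, applied to the sample data from the question. The list literal stands in for the lines read from the file. Note that this approach groups by the whole line, so a line such as "X1" would end up in its own X1.txt rather than in X.txt.

```python
from collections import Counter

# Stand-in for file.read().splitlines() on the sample file
data = ["X", "X", "Y", "Z", "Z", "Z"]

# Counter maps each distinct line to its number of occurrences
counter = Counter(data)

# Rebuild one list per distinct line, repeated as often as it occurred
array2d = [[key] * value for key, value in counter.items()]
```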
