Python: copy lines from a txt file starting at a line with a 2-digit pattern up to, but not including, the next line where the pattern occurs

Time: 2017-01-23 11:50:38

Tags: python

My text file looks like this:

0211111
aaaaaaaa
bbbbbbbb
ccccccccc
02333333
ddddddd
eeeeeeeee
fffffff
02444444
ggggggg
fffffff
jjjjjjjj
0211111
kkkkkkkk
llllllll
mmmmmmm
02333333
ggggggg
fffffff
jjjjjjjj

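I read the lines of this file and want to copy them into 3 new files (0211111.txt, 02333333.txt, 02444444.txt). Each block starts at a line matching 021*, 023*, or 024* and runs until the next line that starts with 02. That next line is not included, so the block is copied up to line [i-1].

The resulting output files should look like this.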

0211111.txt

0211111
aaaaaaaa
bbbbbbbb
ccccccccc
0211111
kkkkkkkk
llllllll
mmmmmmm

02333333.txt

02333333
ddddddd
eeeeeeeee
fffffff
02333333
ggggggg
fffffff
jjjjjjjj

02444444.txt

02444444
ggggggg
fffffff
jjjjjjjj

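I wrote the Python script below, but it does not work as expected: it also takes the lines under the 023* pattern and copies them into the new file 0211111.txt.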

f1 = open("C:\\..\\..merge_d.txt")
f2 = open("C:\\..\\..\\newf_021.txt", 'a')
f3 = open("C:\\..\\..\\newf_023.txt", 'a')
f4 = open("C:\\..\\..\\newf_024.txt", 'a')
f2.truncate(0)
f3.truncate(0)
f4.truncate(0)


lines = f1.readlines()

lines_nums_021 = []
i_x_021 = 0
cache_021 = []
output_data_021 = []
# record, for every line, whether it starts with 021
for i, line in enumerate(lines, 1):
    find_021 = line.startswith("021")
    lines_nums_021.append((find_021, i))
    # lines_nums_023.append((find_023, i))
    # lines_nums_024.append((find_024, i))
# line number of the first line that starts with 021
i_x_021 = next(v[1] for v in lines_nums_021 if v[0] is True)
# i_x_023 = next(v[1] for v in lines_nums_023 if v[0] is True)
# i_x_024 = next(v[1] for v in lines_nums_024 if v[0] is True)
for i, line in enumerate(lines, 1):
    if line.startswith("021"):
        cache_021.append(line)
    elif not line.startswith("021") and i >= i_x_021:
        output_data_021.extend(cache_021)
        output_data_021.append(line)
        cache_021 = []
for item in output_data_021:
    f2.write("%s" % item)
print(i_x_021)

2 answers:

Answer 0 (score: 0)

I'm not sure why you are doing all the caching and so on, but this should do the trick:

import os
files_directory = r"."
in_file_path = os.path.join(files_directory, 'merged.txt')
with open(in_file_path) as in_file:
    out_file = None
    for line in in_file:
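        if line.startswith('02'):
            # a new block starts: switch to the file named after this line
            if out_file:
                out_file.close()
            # append mode, because the same header can appear more than once
            out_file = open(os.path.join(files_directory, line.strip() + '.txt'), 'a')
        if out_file:
            # write the header line and every line that follows it
            out_file.write(line)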
if out_file:
    out_file.close()
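
The same idea can also be sketched as a small function that groups the lines in memory before anything is written, assuming the input is small enough to hold in memory. The name `split_blocks` and the sample data are hypothetical and only for illustration.

```python
from collections import defaultdict


def split_blocks(lines, prefix="02"):
    """Group lines into blocks keyed by the header line that starts each block."""
    blocks = defaultdict(list)
    current = None  # header of the block being read; None before the first header
    for line in lines:
        if line.startswith(prefix):
            # a header line starts a new block (or resumes an earlier one)
            current = line.strip()
        if current is not None:
            blocks[current].append(line)
    return dict(blocks)


# hypothetical sample, shortened from the file in the question
sample = ["0211111\n", "aaaaaaaa\n", "02333333\n", "ddddddd\n", "0211111\n", "kkkkkkkk\n"]
result = split_blocks(sample)
```

Each key of `result` could then be written to `key + '.txt'` opened in `'w'` mode, so that running the script again does not keep appending to the output of an earlier run.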

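Answer 1 (score: 0)

Hope this helps. The code below builds a dictionary of the files to be written, with the matching pattern as the key, and then switches between the file objects as needed.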

import re

# open text file for reading
fp = open('merge_d.txt')

# open 3 files for writing
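f21 = open('0211111.txt', 'w')
f23 = open('02333333.txt', 'w')
f24 = open('02444444.txt', 'w')

# keys are regular expressions; re.match anchors them at the start of the line
# ('021*' as a regex would match every line starting with 02, so plain prefixes are used)
file_patterns = {'021': f21, '023': f23, '024': f24}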

# f_out is a variable that will switch between
# file objects
f_out = None

for line in fp:
    # check each line for a match to the '02...' patterns
    pattern_check = [p for p in file_patterns if re.match(p, line.strip())]
    if any(pattern_check):
        f_out = file_patterns[pattern_check[0]]
    if f_out:
        f_out.write(line)

f21.close()
f23.close()
f24.close()
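
One thing worth checking with this approach is how a notation like `021*` from the question is interpreted. The short sketch below, which uses only the standard `re` and `fnmatch` modules and a sample line from the question, compares the regular-expression reading, the shell-style glob reading, and a plain prefix test.

```python
import fnmatch
import re

line = "02333333"

# As a regular expression, '021*' means "02 followed by zero or more 1s",
# so it matches any line that starts with 02.
regex_hit = re.match("021*", line) is not None

# As a shell-style glob, '021*' means "021 followed by anything".
glob_hit = fnmatch.fnmatchcase(line, "021*")

# A plain prefix test is the simplest way to express the intent.
prefix_hit = line.startswith("021")

print(regex_hit, glob_hit, prefix_hit)
```

Only the regular-expression reading matches a line beginning with 023, so either a plain prefix such as `021` or `str.startswith` is the safer choice when the pattern is meant as "lines starting with 021".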