My text file looks like this:
0211111
aaaaaaaa
bbbbbbbb
ccccccccc
02333333
ddddddd
eeeeeeeee
fffffff
02444444
ggggggg
fffffff
jjjjjjjj
0211111
kkkkkkkk
llllllll
mmmmmmm
02333333
ggggggg
fffffff
jjjjjjjj
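I read the lines that start with 02 and want to copy them into three new files (0211111.txt, 02333333.txt, 02444444.txt). Each block begins at a line matching 021*, 023*, or 024* and runs until the next line starting with 02 appears. That next line is not included, so the copy stops at line [i-1].
The resulting output files should look like this.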
0211111.txt
0211111
aaaaaaaa
bbbbbbbb
ccccccccc
0211111
kkkkkkkk
llllllll
mmmmmmm
02333333.txt
02333333
ddddddd
eeeeeeeee
fffffff
02333333
ggggggg
fffffff
jjjjjjjj
02444444.txt
02444444
ggggggg
fffffff
jjjjjjjj
I wrote the Python script below, but it does not work as expected: it also takes the lines below the 023* pattern and copies them into the new file 0211111.txt.
f1 = open("C:\\..\\..merge_d.txt")
f2 = open("C:\\..\\..\\newf_021.txt", 'a')
f3 = open("C:\\..\\..\\newf_023.txt", 'a')
f4 = open("C:\\..\\..\\newf_024.txt", 'a')
f2.truncate(0)
f3.truncate(0)
f4.truncate(0)
lines = f1.readlines()
lines_nums_021 = []
i_x_021 = 0
cache_021 = []
output_data_021 = []
# record, for every line, whether it starts with "021"
for i, line in enumerate(lines, 1):
    find_021 = line.startswith("021")
    lines_nums_021.append((find_021, i))
    # lines_nums_M.append((find_023, i))
    # lines_nums_C.append((find_024, i))
# line number of the first "021" header
i_x_021 = next(v[1] for v in lines_nums_021 if v[0] is True)
# i_x_M=next(v[1] for v in lines_nums_023 if v[0] is True)
# i_x_C=next(v[1] for v in lines_nums_024 if v[0] is True)
for i, line in enumerate(lines, 1):
    if line.startswith("021"):
        cache_021.append(line)
    # this branch is true for every later line, including the 023*/024* blocks,
    # which is why they also end up in the 021 file
    elif not line.startswith("021") and i >= i_x_021:
        output_data_021.extend(cache_021)
        output_data_021.append(line)
        cache_021 = []
for item in output_data_021:
    f2.write("%s" % item)
print(i_x_021)
Answer 0 (score: 0)
I'm not sure why you are doing all the caching and so on, but this should do the trick:
import os
files_directory = r"."
in_file_path = os.path.join(files_directory, 'merged.txt')
with open(in_file_path) as in_file:
    out_file = None
    for line in in_file:
        if line.startswith('02'):
            # header line: switch to the file named after it
            if out_file:
                out_file.close()
            # append mode, because the same header can appear more than once
            out_file = open(os.path.join(files_directory, line.strip() + '.txt'), 'a')
        if out_file:
            # the header line itself is written too, as in the expected output
            out_file.write(line)
    if out_file:
        out_file.close()
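Because the output files are opened in append mode, running a script like this twice doubles their contents. A possible variant, sketched here under the assumption that the input fits in memory, groups the lines by header first and then writes each file once. The names `split_by_header` and `write_blocks` are made up for this illustration and do not come from the question.

```python
from collections import defaultdict
from pathlib import Path


def split_by_header(lines, prefix="02"):
    """Group lines under the most recent header line.

    A header is any line starting with `prefix`; it is kept as the first
    line of its block. Lines before the first header are dropped.
    """
    blocks = defaultdict(list)
    current = None
    for line in lines:
        if line.startswith(prefix):
            current = line.strip()
        if current is not None:
            blocks[current].append(line)
    return dict(blocks)


def write_blocks(blocks, out_dir="."):
    """Write each group to <header>.txt, replacing any existing file."""
    for header, block in blocks.items():
        Path(out_dir, header + ".txt").write_text("".join(block))


# Small in-memory sample in the same shape as the question's file.
sample = [
    "0211111\n", "aaaaaaaa\n",
    "02333333\n", "ddddddd\n",
    "0211111\n", "kkkkkkkk\n",
]
groups = split_by_header(sample)
```

With a real file this would be used as `write_blocks(split_by_header(open(path)))`, where `path` is whatever input file you have. Repeated headers end up in the same list, so each output file is written exactly once.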
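Answer 1 (score: 0)
Hope this helps. The code below builds a dictionary with one entry per output file, keyed by the matching pattern, and then switches between the file objects as needed.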
import re
# open text file for reading
fp = open('merge_d.txt')
# open 3 files for writing
f21 = open('0211111.txt', 'w')
f23 = open('02333333.txt', 'w')
f24 = open('02444444.txt', 'w')
# keys are regex prefixes tested with re.match (not shell wildcards)
file_patterns = {'021': f21, '023': f23, '024': f24}
# f_out is a variable that will switch between
# file objects
f_out = None
for line in fp:
    # check each line for a match to the '02...' patterns
    pattern_check = [p for p in file_patterns if re.match(p, line.strip())]
    if pattern_check:
        f_out = file_patterns[pattern_check[0]]
    if f_out:
        f_out.write(line)
fp.close()
f21.close()
f23.close()
f24.close()
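One thing to watch for: the question writes its patterns as 021*, 023* and 024*, which read naturally as shell-style wildcards. Passed to `re.match`, the same text is a regular expression in which `1*` means "zero or more 1s", so `021*` matches every line that starts with 02. The short check below, using only the standard `re` and `fnmatch` modules, illustrates the difference on one header line from the question.

```python
import re
from fnmatch import fnmatchcase

header = "02333333"

# As a regular expression, "021*" is "02" followed by zero or more "1",
# so it matches the start of any line beginning with "02".
regex_hit = re.match("021*", header) is not None

# As a shell-style wildcard, "021*" is "021" followed by anything.
glob_hit = fnmatchcase(header, "021*")

# A plain prefix test expresses the same intent without pattern syntax.
prefix_hit = header.startswith("021")

print(regex_hit, glob_hit, prefix_hit)  # prints: True False False
```

For fixed prefixes like these, `str.startswith` is probably the simplest choice, since there is no pattern syntax to misread.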