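I have a folder containing 425 similarly named files, such as "00001q1.txt, 00002w2.txt, 00003e3.txt ... 00425q1.txt". Each file contains one line of text that sits between two other rows. Those two rows are identical in every file. I need to extract these lines and save them to an output file, one per line.
Here is a script that loops over all the files in the folder, but it does not extract the desired line from each file in the list into the output file.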
#!/usr/bin/python
# Open a file
import re
import os
import sys
import glob
outfile = open("list7.txt", "w")
# This would print all the files and directories (in sorted order)
full_path = r"F:\files\list"
filelist = sorted(os.listdir(full_path))
print(filelist)
# This would scan the filelist and extract the desired line located between two rows:
# 00001q1.txt:
# Row above line
# line
# Row under line
buffer = []
for line in filelist:
    if line.startswith("Row above line"):
        buffer = ['']
    elif line.startswith("Row under line"):
        outfile.write("".join(buffer))
        buffer = []
    elif buffer:
        buffer.append(line)
# infile.close()
outfile.close()
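If I open a single file in the script (for example "00001q1.txt") instead of using filelist, the desired line is written to outfile successfully. How should I change the script so that it scans the whole list of files?
Thanks in advance.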
Answer 0 (score: 1)
If I understand correctly, you want to write all the required occurrences to list7.txt:
import os

full_path = r"F:\files\list"
filelist = sorted(os.listdir(full_path))

# Open the output file once; the with block closes it automatically.
with open("list7.txt", "w") as outfile:
    buffer = []
    for filename in filelist:
        # os.listdir returns bare names, so join each one with the folder path.
        with open(os.path.join(full_path, filename), "r") as infile:
            for line in infile:
                if line.startswith("Row above line"):
                    # A non-empty list marks that capturing has started.
                    buffer = ['']
                elif line.startswith("Row under line"):
                    outfile.write("".join(buffer))
                    buffer = []
                elif buffer:
                    buffer.append(line)
    # Write anything left over from a block that never reached its closing row.
    for line in buffer:
        outfile.write(line)
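The same marker logic can also be written with an explicit flag instead of the `['']` placeholder, which some readers may find easier to follow. This is only a minimal sketch, assuming Python 3; the function name `lines_between` and the sample data are made up for illustration. Note one behavioural difference: it yields lines as soon as they are seen, so a block with no closing row is still emitted.

```python
def lines_between(lines, start="Row above line", end="Row under line"):
    """Yield every line found after a `start` row and before the next `end` row."""
    capturing = False
    for line in lines:
        if line.startswith(start):
            capturing = True       # the next lines are the ones we want
        elif line.startswith(end):
            capturing = False      # stop until the next start row
        elif capturing:
            yield line


# Hypothetical file content, given as a list of lines.
sample = [
    "header\n",
    "Row above line\n",
    "wanted text\n",
    "Row under line\n",
    "footer\n",
]
extracted = list(lines_between(sample))
```

Because an open file object is itself an iterable of lines, `lines_between(infile)` could be used directly inside the per-file loop.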
Answer 1 (score: 1)
You need to iterate over the files, and then over the lines within each file:
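# Uses os, full_path, filelist and outfile as defined in the question's script.
buffer = []
for fileName in filelist:
    # Join with the folder path: os.listdir returns names without the directory.
    # Plain "r" is used because the "rU" mode was removed in Python 3.11.
    with open(os.path.join(full_path, fileName), "r") as f:
        for line in f:
            if line.startswith("Row above line"):
                buffer = ['']
            elif line.startswith("Row under line"):
                outfile.write("".join(buffer))
                buffer = []
            elif buffer:
                buffer.append(line)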
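One thing to watch for when opening the files: `os.listdir` returns bare file names relative to the folder, and it also returns subdirectories and any non-text files. A minimal sketch of building the list with `glob` instead, assuming Python 3 and that only `*.txt` files are of interest (the helper name `list_text_files` is hypothetical):

```python
import glob
import os


def list_text_files(folder, pattern="*.txt"):
    """Return the full paths of the files in `folder` matching `pattern`, sorted."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


# Possible usage with the folder from the question:
# filelist = list_text_files(r"F:\files\list")
```

The returned paths can be passed straight to `open()`. If list7.txt is written into the same folder, it would match the pattern on a later run, so it is probably safer to keep the output file somewhere else.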