I have multiple files. Each file has the format shown below:
<float> <int> <stringSAME>
<float> <int> <stringSAME>
<float> <int> <string>
......
<float> <int> <stringSAME>
......
......
<float> <int> <string>
<float> <int> <stringSAME>
<float> <int> <stringSAME>
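Here, lines 1 and 2 contain the same string, and so do the last few lines. It is denoted stringSAME. I want to remove the stringSAME lines from the beginning and the end of each file, while keeping everything between them intact. The same process has to run on multiple files with this format. Please suggest some solutions. I am using Python.
Answer 0 (score: 0)
There are a few ways to do this, depending on how you plan to use the result afterwards.
If you only want to use the lines as data, you can do this: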
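    with open("path/to/file") as f:
        # read while the file is still open; splitlines() also drops the trailing newlines
        lines = f.read().splitlines()
    data = [line for line in lines if not line.endswith("<stringSAME>")]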
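Note that the filter above drops every line ending with the marker, including any that sit in the middle of the file. If only the leading and trailing runs should go, as the question describes, a minimal sketch could look like the following. The helper name `strip_edge_lines` and the sample marker `"SAME"` are assumptions made up for illustration; they are not part of the original answer.

```python
def strip_edge_lines(lines, marker):
    """Return lines without the leading and trailing runs that end with marker."""
    start = 0
    end = len(lines)
    # skip the run of marker lines at the top of the file
    while start < end and lines[start].endswith(marker):
        start += 1
    # skip the run of marker lines at the bottom of the file
    while end > start and lines[end - 1].endswith(marker):
        end -= 1
    return lines[start:end]


# hypothetical sample data: the marker also appears once in the middle
sample = [
    "1.0 1 SAME",
    "2.0 2 SAME",
    "3.0 3 other",
    "4.0 4 SAME",
    "5.0 5 other",
    "6.0 6 SAME",
]
trimmed = strip_edge_lines(sample, "SAME")
print(trimmed)
```

The function expects lines that have already had their newlines removed, for example via `f.read().splitlines()`. A file consisting only of marker lines comes back as an empty list.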
If you need to change the file itself, I would consider overwriting it afterwards.
    with open("path/to/file","w") as f:
        # opening in "w" mode BLANKS the file, so make sure
        # that you've saved that original data somewhere
        for line in data:
            f.write(line+"\n")
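Because "w" mode empties the file immediately, a crash halfway through the write would lose data. One way to reduce that risk, sketched here as an assumption rather than as part of the original answer, is to write to a temporary file in the same directory and then swap it into place with `os.replace`. The function name `overwrite_atomically` and the demo file name are hypothetical.

```python
import os
import tempfile


def overwrite_atomically(path, lines):
    """Write lines to a temp file next to path, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # same directory, so the final rename stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp:
            for line in lines:
                tmp.write(line + "\n")
        os.replace(tmp_path, path)
    except BaseException:
        # remove the temp file if anything went wrong, then re-raise
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# small demo in a throwaway directory
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.txt")
with open(demo_path, "w") as f:
    f.write("old contents\n")
overwrite_atomically(demo_path, ["1.0 1 a", "2.0 2 b"])
with open(demo_path) as f:
    result = f.read()
```

One caveat: the replaced file takes on the permissions of the temporary file, which `mkstemp` creates as readable and writable by the current user only, so this may need adjusting if other users read the files.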
You can also write it to a new copy of the file:
    with open("path/to/sanitized_file","w") as f:
        for line in data:
            f.write(line+"\n")
If you are trying to do this on multiple files, build a list of them first.
    import os

    list_of_files = ["file1.txt","file2.txt","file3.txt"]
    for file in list_of_files:
        in_file = os.path.join("path","to",file)
        out_file = os.path.join("path","to","post proc",file)
        # if you have to do it to a whole directory worth of files, try this instead
        ## import glob
        ## list_of_files = glob.glob("path/to/dir/*")
        ## for file in list_of_files:
        ##     in_file = file
        ##     head,tail = os.path.split(file)
        ##     out_file = os.path.join(head,"post proc",tail)
        # which simplifies a lot of the following, since none of the FileNotFoundErrors
        # should ever trigger, other than the one in case the post proc directory
        # doesn't exist
        try:
            with open(in_file, 'r') as f:
                # read while the file is open; splitlines() drops the trailing newlines
                lines = f.read().splitlines()
        except FileNotFoundError as e:
            # FileNotFoundError is a subclass of IOError, so it has to be caught first
            # log and handle error if the file isn't there
            print("input file not found:", e)
            continue
        except IOError as e:
            # log and handle error if you can't open file
            print("could not read input file:", e)
            continue
        data = [line for line in lines if not line.endswith("<stringSAME>")]
        try:
            with open(out_file,'w') as f:
                for line in data:
                    f.write(line+"\n")
        except FileNotFoundError as e:
            # log and handle error if the directory doesn't exist
            print("output directory not found:", e)
        except IOError as e:
            # log and handle error if you can't write to file
            print("could not write output file:", e)
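The comments in the loop point out that, with the glob variant, the missing "post proc" directory is the only remaining cause of a FileNotFoundError. Creating that directory up front removes that case too. The following is a rough sketch under that assumption; the function name `process_directory`, its parameters, and the demo file contents are hypothetical, and the filter is the same `endswith` test used above.

```python
import glob
import os
import tempfile


def process_directory(src_dir, marker, out_name="post proc"):
    """Filter every regular file in src_dir into src_dir/out_name."""
    out_dir = os.path.join(src_dir, out_name)
    os.makedirs(out_dir, exist_ok=True)  # no error if it already exists
    written = []
    for in_file in sorted(glob.glob(os.path.join(src_dir, "*"))):
        if not os.path.isfile(in_file):
            continue  # skips subdirectories, including the output directory
        with open(in_file) as f:
            lines = f.read().splitlines()
        kept = [line for line in lines if not line.endswith(marker)]
        out_file = os.path.join(out_dir, os.path.basename(in_file))
        with open(out_file, "w") as f:
            for line in kept:
                f.write(line + "\n")
        written.append(out_file)
    return written


# small demo in a throwaway directory
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "file1.txt"), "w") as f:
    f.write("1.0 1 SAME\n2.0 2 keep\n3.0 3 SAME\n")
outputs = process_directory(demo_dir, "SAME")
with open(outputs[0]) as f:
    demo_result = f.read()
```

The originals are left untouched, so the function can be rerun safely; the output files are simply rewritten.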