I have a file like this:
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
a b c invalid # separated by tab
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
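I need to replace a b c invalid
with a b reviewed rd # separated by tab
Basically, for any line that ends with invalid, I need to replace the third and fourth words with reviewed rd (separated by a tab), while keeping the first and second words on that line.
I've started with something like this, but it doesn't quite do what I want.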
f1 = open('fileInput', 'r')
f2 = open('fileInput' + '.tmp', 'w')
for line in f1:
    f2.write(line.replace('invalid', 'reviewed' + '\t' + 'rd'))
f1.close()
f2.close()
Regex could be an option, but I'm not that good with it yet. Can anyone help?
P.S. a, b and c are variables, so I can't do an exact search for 'a', 'b', 'c'.
Answer 0 (score: 2)
f1 = open('fileInput', 'r')
f2 = open('fileInput' + '.tmp', 'w')
for line in f1:
    # Compare without the trailing newline
    if line.rstrip('\n').endswith('invalid'):
        # Keep the first two tab-separated fields, replace the rest
        f2.write('\t'.join(line.split('\t')[:2] + ['reviewed', 'rd']) + '\n')
    else:
        f2.write(line)
f1.close()
f2.close()
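To try the split-and-join approach above without touching any files, the per-line logic can be pulled into a small function. This is only a minimal sketch: `fix_line` is a hypothetical name that is not part of the original answer, and it assumes the fields are separated by single tab characters.

```python
def fix_line(line):
    """Replace fields 3 and 4 with 'reviewed' and 'rd' if the line ends with 'invalid'.

    Hypothetical helper for illustration; assumes tab-separated fields.
    """
    body = line.rstrip("\n")
    ending = line[len(body):]  # preserve whatever newline the line had
    if body.endswith("invalid"):
        fields = body.split("\t")
        return "\t".join(fields[:2] + ["reviewed", "rd"]) + ending
    return line


if __name__ == "__main__":
    sample = ["xxxxxxxxxxxxxxx\n", "a\tb\tc\tinvalid\n"]
    for text in sample:
        print(repr(fix_line(text)))
```

Lines that do not end with invalid are returned unchanged, so the function can be dropped into the loop in place of the if/else.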
Answer 1 (score: 2)
import re

# Match the third field followed by a trailing "invalid" at the end of the line
pattern = re.compile(r'\t\S+\tinvalid$')
with open('data') as fin:
    with open('output', 'w') as fout:
        for line in fin:
            fout.write(pattern.sub('\treviewed\trd', line))
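One thing to be aware of with the pattern above is that `\S+` will not match a third field that contains spaces. A possible variation, sketched here as an assumption rather than as part of the original answer, matches anything except a tab instead. The name `fix_line_re` is hypothetical.

```python
import re

# Variation on the pattern above: the third field may contain spaces,
# but not tabs or newlines. "$" also matches just before a trailing newline.
pattern = re.compile(r'\t[^\t\n]*\tinvalid$')


def fix_line_re(line):
    """Hypothetical helper: replace the last two fields when the line ends with 'invalid'."""
    return pattern.sub('\treviewed\trd', line)


if __name__ == "__main__":
    print(repr(fix_line_re("a\tb\tc d\tinvalid\n")))
    print(repr(fix_line_re("xxxxxxxxxxxxxxx\n")))
```

Because `$` matches just before the final newline, the newline itself is left in place and does not need to be added back.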
Answer 2 (score: 1)
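import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('input.tab', newline='') as fin, open('output.tab', 'w', newline='') as fout:
    tabin = csv.reader(fin, delimiter='\t')
    tabout = csv.writer(fout, delimiter='\t')
    for row in tabin:
        if len(row) != 4:
            continue  # or raise - whatever; note that skipping drops the row from the output
        if row[-1] == 'invalid':
            tabout.writerow(row[:2] + ['reviewed', 'rd'])
        else:
            tabout.writerow(row)  # keep other four-field rows unchanged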