My input file looks like this:
car
dog
Rock
The output file I am trying to edit is shown below. My goal is to delete every line that contains the word car.
cat car
sky rat
car cloud
This is my initial code. The problem is that it only deletes lines that consist of exactly the word "car" and nothing else.
from __future__ import print_function
import linecache
import fileinput
must_delete = linecache.getline('Test.txt', 1)
for line in fileinput.input('output.txt', inplace=True):
    if line != must_delete:
        print(line, end='')
Answer 0 (score: 1)
from __future__ import print_function
import re
import linecache
import fileinput
must_delete = "car" # linecache.getline('Test.txt', 1)
text = '''
cat car g
sky rat
car cloud
scary thing
'''
with open("cleaned_file.txt", "w") as clean:
    for line in text.splitlines():  # fileinput.input('output.txt', inplace=True):
        if re.search(r"(\b" + must_delete + r"\b)", line, flags=re.IGNORECASE):
            print("deleting line:" + line)
        else:
            print("this line has to be kept in the output: " + line)
            clean.write(line + "\n")
# cleaned_file.txt has all the needed lines
Output:
this line has to be kept in the output:
deleting line:cat car g
this line has to be kept in the output: sky rat
deleting line:car cloud
this line has to be kept in the output: scary thing
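I used a regular expression built from the word you want to delete, wrapped in two word boundaries (\b), so "car" has to appear as a complete word. If the pattern is not found, re.search() returns None.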
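To make the effect of the word boundaries concrete, here is a minimal sketch. The helper name contains_word is made up for illustration, and re.escape is an addition that is not in the code above; it only matters if the word contains regex metacharacters.

```python
import re

def contains_word(word, line):
    """Return True if `word` occurs in `line` as a whole word (case-insensitive)."""
    # re.escape is an added precaution (not in the original answer): it keeps
    # characters such as "." or "+" in `word` from being read as regex syntax.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, line, flags=re.IGNORECASE) is not None

print(contains_word("car", "cat car g"))    # True: "car" stands alone
print(contains_word("car", "scary thing"))  # False: "car" is inside "scary"
print(contains_word("car", "Car cloud"))    # True: IGNORECASE matches "Car"
print(re.search(r"\bcar\b", "sky rat"))     # None: no match at all
```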
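As pointed out in the comments, "scary" also contains "car". That is why a simple if "car" in "scary": check is not enough: it would also match words that contain "car" without being the word "car".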
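The same idea can be applied to the file in place, the way the question's code tried to do with fileinput. This is only a sketch under assumptions: the function name remove_lines_with_word is hypothetical, re.escape is an added precaution, and the demo runs on a temporary file with sample lines rather than on the real output.txt.

```python
from __future__ import print_function
import fileinput
import os
import re
import tempfile

def remove_lines_with_word(path, word):
    """Rewrite `path` in place, dropping every line that contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    for line in fileinput.input(path, inplace=True):
        if not pattern.search(line):
            # With inplace=True, print() writes back into the file.
            print(line, end="")

# The plain substring test is too greedy: it also hits "scary".
print("car" in "scary thing")  # True

# Demo on a temporary file holding sample lines (not the real output.txt).
demo = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
demo.write("cat car\nsky rat\ncar cloud\nscary thing\n")
demo.close()

remove_lines_with_word(demo.name, "car")

with open(demo.name) as f:
    remaining = f.read()
os.remove(demo.name)

print(remaining, end="")  # only "sky rat" and "scary thing" remain
```

For the files in the question, the word could be read with linecache.getline('Test.txt', 1).strip(). The .strip() matters because getline returns the line with its trailing newline, which is why the original comparison only ever matched lines that were exactly "car".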