Deleting lines in a file that contain a certain word in Python

Time: 2018-01-05 17:53:25

Tags: python arrays python-2.7 sorting

My input file looks like this:

car
dog
Rock

The output file I am trying to edit looks like this. My whole goal is to delete every line that contains the word car:

cat car
sky rat
car cloud

Here is my initial code. The problem is that it only deletes lines that consist of exactly the word "car" and nothing else:

from __future__ import print_function
import linecache
import fileinput

must_delete = linecache.getline('Test.txt', 1)

for line in fileinput.input('output.txt', inplace=True):
    if line != must_delete:
        print(line, end='')

1 Answer:

Answer 0 (score: 1)

from __future__ import print_function
import re
import linecache
import fileinput

must_delete = "car" # linecache.getline('Test.txt', 1)

text = '''
cat car g
sky rat
car cloud
scary thing
''' 

with open("cleaned_file.txt","w") as clean:
    for line in text.splitlines() :          # fileinput.input('output.txt', inplace=True):
        if re.search(r"\b" + re.escape(must_delete) + r"\b", line, flags=re.IGNORECASE):
            print ("deleting line:"+ line)
        else:
            print ("this line has to be kept in the output: " + line)
            clean.write(line+"\n")

# cleaned_file.txt has all the needed lines

Output:

this line has to be kept in the output: 
deleting line:cat car g
this line has to be kept in the output: sky rat
deleting line:car cloud
this line has to be kept in the output: scary thing

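I used a regular expression built from the word you want to delete, surrounded by two word boundaries (\b), so "car" has to appear as a complete word. If the pattern is not found, re.search() returns None, and the line is kept.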
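
The word-boundary check can be sketched as a small reusable helper. The function name `contains_word` is hypothetical, and `re.escape` is added on the assumption that the word to delete might contain regex metacharacters.

```python
import re

def contains_word(line, word):
    """Return True if `word` occurs in `line` as a whole word (case-insensitive)."""
    # \b matches the boundary between a word character and a non-word character.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, line, flags=re.IGNORECASE) is not None

lines = ["cat car g", "sky rat", "car cloud", "scary thing", "Car"]
kept = [line for line in lines if not contains_word(line, "car")]
print(kept)  # ['sky rat', 'scary thing']
```

"scary thing" survives because the "car" inside "scary" has word characters on both sides, so neither \b can match there.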

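As pointed out in the comments, "scary" also contains "car". That is why a simple if "car" in "scary": check is not enough: it would also remove lines with words that merely contain "car" without being the word "car".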
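
To apply the same idea to the real file instead of an inline string, the regex check can be combined with the `fileinput` approach from the question. This is only a sketch: the function name `delete_lines_with_word` is hypothetical, and it assumes the file is small enough to rewrite in place.

```python
from __future__ import print_function
import fileinput
import re

def delete_lines_with_word(path, word):
    """Rewrite `path` in place, dropping lines that contain `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    for line in fileinput.input(path, inplace=True):
        # With inplace=True, whatever is printed is written back into the file.
        if not pattern.search(line):
            print(line, end="")

# Hypothetical usage with the file names from the question:
# delete_lines_with_word("output.txt", "car")
```

If the word is read with linecache.getline('Test.txt', 1) as in the question, strip it first (for example with .strip()), because getline returns the line including its trailing newline.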