I have a long string of text that contains a sub-text I want to remove based on a partial match (about 90%).
string = """Adam is a boy who lives in Michigan.
He loves to eat apples and oranges.
He also enjoys playing with his dog and cat.
Adam is a happy boy."""
substring = "He loves to apple oranges"
I want to get back:
"Adam is a boy who lives in Michigan.
He also enjoys playing with his dog and cat.
Adam is a happy boy."
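The words "eat" and "and" do not appear in the substring, but I still want to delete the whole sentence "He loves to eat apples and oranges." I'm not sure how to do this. Thanks!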
Answer 0 (score: 4)
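You can use difflib.SequenceMatcher to score each line against the substring and keep only the lines whose similarity ratio stays below a threshold: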
from difflib import SequenceMatcher
'\n'.join(s for s in string.splitlines() if SequenceMatcher(' '.__eq__, s, substring).ratio() < 0.6)
This returns:
Adam is a boy who lives in Michigan.
He also enjoys playing with his dog and cat.
Adam is a happy boy.
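The one-liner above can be wrapped in a small helper so the threshold is easy to tune. This is only a sketch: the function name remove_similar_lines and its threshold parameter are made up for illustration, and it assumes, as the answer does, that one sentence corresponds to one line of the input.

```python
from difflib import SequenceMatcher


def remove_similar_lines(text, pattern, threshold=0.6):
    """Return `text` without the lines whose similarity to `pattern` reaches `threshold`.

    Hypothetical helper: the name and the default threshold are illustrative.
    """
    kept = []
    for line in text.splitlines():
        # The first argument is the `isjunk` predicate; spaces are treated as junk
        # so they cannot seed a match on their own.
        score = SequenceMatcher(lambda ch: ch == " ", line, pattern).ratio()
        if score < threshold:
            kept.append(line)
    return "\n".join(kept)


text = """Adam is a boy who lives in Michigan.
He loves to eat apples and oranges.
He also enjoys playing with his dog and cat.
Adam is a happy boy."""

print(remove_similar_lines(text, "He loves to apple oranges"))
```

Note that ratio() is computed as 2*M/T, where M is the number of matched characters and T is the combined length of both strings. For the 35-character sentence and the 25-character substring here, the ratio cannot exceed 2*25/60, roughly 0.83, so a literal 0.9 threshold would leave the sentence in place. The threshold has to be chosen with the length difference in mind.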
Answer 1 (score: 0)
string = string.replace(substring,'')
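This replaces every occurrence of the substring in the string with an empty string (""). Note that str.replace only removes exact occurrences, so with the partially matching substring from the question the string is returned unchanged.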