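I discussed a similar problem in the linked question, Python extract sentence containing word, but I do not want the periods inside a numeric string to end the sentence.
For example: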
The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free.
When I try this:
import re
txt="The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free."
define_words = 'apt subtitle'
print (re.findall(r"([^.]*?%s[^.]*\.)" % define_words,txt))
Actual output:
The apt subtitle for the binoculars will be 9015.
However, the expected output is:
The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars.
Can someone help me get the expected output?
Answer 0 (score: 0)
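Use a negative lookahead in the regex to assert that the . ending the match is not followed by a digit.
This works for your example input, but it may need some tweaking to handle more cases.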
import re
txt="The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free."
define_words = 'apt subtitle'
# (?!\d) rejects a closing period that is immediately followed by a digit
print (re.findall(r"([^.]*?%s.*?\.)(?!\d)" % define_words,txt))
# ['The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars.']
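One case that may need the tweaking mentioned above: the leading [^.]*? cannot cross a period, so a dotted number that appears before the search phrase would cut off the start of the sentence. A possible alternative is to split the text into sentences first, treating a period as a terminator only when no digit follows it, and then keep the sentences that contain the phrase. The sketch below assumes sentences end with a period only; the function name sentences_containing and the pattern name SENTENCE are made up for this illustration and are not part of the re module.

```python
import re

# One sentence: text without periods, then any number of "period + digit"
# continuations (e.g. 9015.18.1190), then a closing period that is not
# followed by a digit.
SENTENCE = re.compile(r"[^.]*(?:\.(?=\d)[^.]*)*\.(?!\d)")

def sentences_containing(text, phrase):
    """Return the sentences of `text` that contain `phrase` (hypothetical helper)."""
    sentences = (s.strip() for s in SENTENCE.findall(text))
    return [s for s in sentences if phrase in s]

txt = ("The apt subtitle for the binoculars will be 9015.18.1190, CTS, "
       "which provides for binoculars. The rate of duty on this will be free.")
print(sentences_containing(txt, "apt subtitle"))
# ['The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars.']
```

Because the phrase is checked with a plain substring test, it does not need re.escape. If you keep the single-regex approach instead, wrapping define_words in re.escape is worth considering when the phrase may contain regex metacharacters.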