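Python: extract a sentence containing a phrase without ending it at a numeric string

Time: 2016-12-12 09:10:09

Tags: python regex python-2.7 python-3.x

I discussed a similar problem in the question "Python extract sentence containing word", but I do not want a numeric string such as 9015.18.1190 to end the sentence.

For example: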

The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free.

When I try this:

import re
txt="The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free."
define_words = 'apt subtitle'
print (re.findall(r"([^.]*?%s[^.]*\.)" % define_words,txt))

Actual output:

The apt subtitle for the binoculars will be 9015.

However, the expected output is:

The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars.

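Can someone help me get the expected output?

1 answer:

Answer 0: (score: 0)

Use a negative lookahead so that the match can only end at a . that is not followed by a digit. The question's pattern stops at the first period because [^.]* cannot cross one; replacing it with a lazy .*? and adding (?!\d) after the final \. lets the match run past the periods inside 9015.18.1190.

This works for your example input, but it may need some tweaking to handle more cases.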

import re
txt="The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars. The rate of duty on this will be free."
define_words = 'apt subtitle'
print (re.findall(r"([^.]*?%s.*?\.)(?!\d)" % define_words,txt))
# Prints: ['The apt subtitle for the binoculars will be 9015.18.1190, CTS, which provides for binoculars.']
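
One case the pattern above does not cover is a dotted number that appears before the search phrase, because the leading [^.]*? still cannot cross a period. The following is a minimal sketch of one possible adjustment, not part of the original answer: the helper name sentences_containing is hypothetical, and it assumes that a period ends a sentence only when it is not followed by a digit. It also passes the phrase through re.escape so that any regex metacharacters in it are matched literally.

```python
import re

def sentences_containing(text, phrase):
    """Return the sentences in `text` that contain `phrase`.

    Assumption: a period ends a sentence only if no digit follows it.
    """
    # One "sentence character": any non-period, or a period followed by a digit.
    body = r"(?:[^.]|\.(?=\d))*?"
    # re.escape makes the phrase match literally, whatever it contains.
    pattern = r"(%s%s%s\.)(?!\d)" % (body, re.escape(phrase), body)
    # Matches after the first sentence start with the separating space, so strip.
    return [m.strip() for m in re.findall(pattern, text)]

txt = ("The apt subtitle for the binoculars will be 9015.18.1190, CTS, "
       "which provides for binoculars. The rate of duty on this will be free.")

print(sentences_containing(txt, "apt subtitle"))
print(sentences_containing(txt, "rate of duty"))
print(sentences_containing("Version 2.5 of the apt subtitle is here. Next.", "apt subtitle"))
```

With this sketch, the third call keeps "Version 2.5" inside the sentence, whereas the answer's pattern would begin the match after "2.". Abbreviations such as "e.g." or a sentence that is immediately followed by a digit would still be split incorrectly, so treat it as a starting point only.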