Question

I'm trying to get the text between "Job POS" and "working role". I need to get it into a variable.

import re
req_id_num = """Job POS: -PLEASE MAKE SURE YOU ARE GOOD.

-LOOKING FOR CONTRACTOR WHO IS STRONG IN LIFTING.

-LOOKING FOR SOMEONE WHO IS PROFICIENT IN AT THE EXECUTIVE LEVEL.

-Looking for more of a financial background than accounting background, working role"""

Req_Job_description = re.search(r'Job POS: -(.*?) working role',
                                req_id_num).group(0)

if Req_Job_description:
    print "search -->searchObj.group():", Req_Job_description
else:
    print "Nothing found!!"

Running this produces the following error:

Req_Job_description = re.search(r'Job POS: -(.*?) working role', req_id_num).group(0)
AttributeError: 'NoneType' object has no attribute 'group'

Answer 1

Why not avoid regular expressions altogether (which you should do in most cases) and use slicing instead?

description = text[len("Job POS: "):-len(" working role")]

This cuts off the prefix and the suffix based on their lengths.
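
As a minimal sketch of the slicing idea, the snippet below only slices when the markers are actually present. The helper name strip_affixes and the shortened sample string are assumptions made for illustration, not part of the original answer. Note that with the suffix " working role" the comma that precedes it in the question's text stays in the result.

```python
# Hypothetical helper (not from the original answer): slice off a known
# prefix and suffix, but only if the text really starts and ends with them.
def strip_affixes(text, prefix, suffix):
    if text.startswith(prefix) and text.endswith(suffix):
        # Compute the end index explicitly so an empty suffix also works.
        return text[len(prefix):len(text) - len(suffix)]
    return None

# Shortened stand-in for the multi-line string in the question.
sample = "Job POS: -first line\n\n-second line, working role"

description = strip_affixes(sample, "Job POS: ", " working role")
print(repr(description))
```

Plain slicing silently returns the wrong text when the string does not have the expected shape, which is why this sketch checks with startswith and endswith first.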

Answer 2

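This fails because your search text contains newlines. By default, the dot character matches anything except a newline. You need to pass the re.DOTALL flag to change this behavior. A simplified example: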

>>> import re
>>> pat = re.compile("INTRO: (.*?) TRAILER")
>>> m = pat.search("INTRO: this is data TRAILER")
>>> m.group(1)
'this is data'
>>> m = pat.search("INTRO: this \nis\n data TRAILER")
>>> m
>>> # m is None -- no Match object was returned.

Trying again with the DOTALL flag:

>>> pat = re.compile("INTRO: (.*?) TRAILER", re.DOTALL)
>>> m = pat.search("INTRO: this \nis\n data TRAILER")
>>> m
<_sre.SRE_Match object at 0x106b87738>
>>> m.group(1)
'this \nis\n data'
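
Applied to the pattern from the question, the same fix might look like the sketch below. The input string is a shortened copy of the one in the question, and the variable names pattern, match and req_job_description are chosen here for illustration. The sketch also checks the result of search() before calling group(), since search() returns None when nothing matches, and it uses group(1) to get only the captured text rather than the whole match.

```python
import re

# Shortened copy of the multi-line string from the question.
req_id_num = """Job POS: -PLEASE MAKE SURE YOU ARE GOOD.

-LOOKING FOR CONTRACTOR WHO IS STRONG IN LIFTING.

-Looking for more of a financial background than accounting background, working role"""

# re.DOTALL lets "." match newlines as well, so the lazy group can span lines.
pattern = re.compile(r"Job POS: -(.*?) working role", re.DOTALL)
match = pattern.search(req_id_num)

# Check for None before touching the match; this avoids the AttributeError.
if match is not None:
    req_job_description = match.group(1)
    print(req_job_description)
else:
    req_job_description = None
    print("Nothing found!!")
```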

Python: unable to get the required text using pattern matching
