I am using the regular expression
[^A-Za-z](email,|help|BGN|won't|go|corner|issues|disconected|We|group|No|send|Bv|connecting|has|Pittsburgh,|Many|(Akustica,|Toluca|cannot|Restarting|they|not|PI2|one|condition|entire|LAN|experincing|bar|Exchange,|server|Are|PA)|OutLook|right|says|Rose|Montalvo|back|computer|are|Jane|thier|Disconnected|Nrd|and/or|network|for|Appears|e-mail|unable|Connected|then|Broadview,|issue|email|shows|available|be|we|exchange|error|address|based|My|Microsoft|received|working|created|receive|impacted|WIFI|through|connection|including|or|IL|outlook|via|facility|Everyone's|servers|Also|message|"The|your|Status|doesn't|service|SI-MBX82.de.bosch.com,|next|appears|"disconnected"|Encryption|eMail/file|today|"Waiting|"send/receive"|but|it|trying|SAP|disconnected|e-mails|this|getting|can|of|connect|Incorrect|manually|is|site|an|folder"|cant|Other|have|in|Receiving|if|Plant|no|SI-MBX80.de.bosch.com|that|when|online|persists."|Customer|administrator|users|update|applications|"Disconnected"|SI-MBX81.de.bosch.com|The|on|lower|Some|It|contact|In|the|having)[^A-Za-z]
and applying it, but it fails to find "Jane" in the sentence
"Issue with eMail/file Encryption Incorrect email address created for Jane Rose Montalvo."
even though Jane appears in the pattern above.
What could be the reason?
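Answer 0 (score: 2)
The problem is that your regex consumes a non-letter character ([^A-Za-z], typically a space) before and after each word, and those characters are part of the match. Take, for example: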
Hello Jane
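Once Hello is matched, the space after it is consumed along with it. Jane is then skipped: it cannot match because the non-letter character it needs in front of it has already been used up. You should make that leading character an assertion rather than part of the match.
Use (?<=[^a-zA-Z]) instead of plain [^a-zA-Z] at the start of the pattern.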
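The difference can be sketched as follows. This is only an illustration: the word list `for|Jane|Rose|Montalvo` is a shortened, hypothetical stand-in for the full alternation in the question, and the variable names are made up for the example.

```python
import re

sentence = ("Issue with eMail/file Encryption Incorrect email address "
            "created for Jane Rose Montalvo.")

# Shortened word list for illustration only (assumption); the real
# pattern in the question has many more alternatives.
words = r"(for|Jane|Rose|Montalvo)"

# Original style: the non-letter on each side is consumed by the match,
# so the space between two adjacent words can only serve one of them.
consuming = re.compile(r"[^A-Za-z]" + words + r"[^A-Za-z]")

# Suggested fix: the leading non-letter is asserted with a lookbehind,
# so it may already have been consumed by the previous match.
asserting = re.compile(r"(?<=[^A-Za-z])" + words + r"[^A-Za-z]")

consuming_hits = consuming.findall(sentence)
asserting_hits = asserting.findall(sentence)

print(consuming_hits)  # ['for', 'Rose']
print(asserting_hits)  # ['for', 'Jane', 'Rose', 'Montalvo']
```

With the consuming pattern every second adjacent word is skipped, which is why Jane goes missing in the original sentence.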
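Answer 1 (score: 2)
This happens because the matches overlap: adjacent words share the non-letter character between them. Put the whole pattern, with its capturing group, inside a lookahead so that the overlapping characters are not consumed: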
(?=[^A-Za-z](email,|help|BGN|won't|go|corner|issues|disconected|We|group|No|send|Bv|connecting|has|Pittsburgh,|Many|(Akustica,|Toluca|cannot|Restarting|they|not|PI2|one|condition|entire|LAN|experincing|bar|Exchange,|server|Are|PA)|OutLook|right|says|Rose|Montalvo|back|computer|are|Jane|thier|Disconnected|Nrd|and/or|network|for|Appears|e-mail|unable|Connected|then|Broadview,|issue|email|shows|available|be|we|exchange|error|address|based|My|Microsoft|received|working|created|receive|impacted|WIFI|through|connection|including|or|IL|outlook|via|facility|Everyone's|servers|Also|message|"The|your|Status|doesn't|service|SI-MBX82\.de\.bosch\.com,|next|appears|"disconnected"|Encryption|eMail/file|today|"Waiting|"send/receive"|but|it|trying|SAP|disconnected|e-mails|this|getting|can|of|connect|Incorrect|manually|is|site|an|folder"|cant|Other|have|in|Receiving|if|Plant|no|SI-MBX80\.de\.bosch\.com|that|when|online|persists\."|Customer|administrator|users|update|applications|"Disconnected"|SI-MBX81\.de\.bosch.com|The|on|lower|Some|It|contact|In|the|having)[^A-Za-z])
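A small sketch of how such a lookahead pattern might be used from Python. The shortened word list and the names `overlapping` and `found` are assumptions made for the example, not part of the original pattern.

```python
import re

sentence = ("Issue with eMail/file Encryption Incorrect email address "
            "created for Jane Rose Montalvo.")

# Shortened word list for illustration only (assumption).
words = r"(for|Jane|Rose|Montalvo)"

# The entire pattern sits inside a lookahead, so every match is
# zero-width and consumes nothing; group 1 still records the word.
overlapping = re.compile(r"(?=[^A-Za-z]" + words + r"[^A-Za-z])")

# Read group 1 from each zero-width match.
found = [m.group(1) for m in overlapping.finditer(sentence)]

print(found)  # ['for', 'Jane', 'Rose', 'Montalvo']
```

Because the match itself is empty, the words have to be read from the capturing group, not from `m.group()`. Note that the full pattern above contains a second, nested group (opened at `(Akustica,`), so `re.findall` on it would return tuples instead of plain strings.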
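Answer 2 (score: 0)
If for some reason you can't or don't want to modify your pattern and you still want to capture overlapping matches, you can call re.search in a loop, moving the start of each search to the character just after the start of the previous match.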
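import re

# s is the sentence and p the pattern string from the question.

# Recursive: concatenate each match, then search again starting one
# character after the start of the previous match.
def foo(s, p, start=0):
    m = p.search(s, start)
    if not m:
        return ''
    return m.group() + foo(s, p, m.start() + 1)

# Iterative version of the same idea.
def foo1(s, p):
    result = ''
    m = p.search(s, 0)
    while m:
        result += m.group()
        m = p.search(s, m.start() + 1)
    return result

print(foo(s, re.compile(p)))
print(foo1(s, re.compile(p)))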
>>>
eMail/file Encryption Incorrect email address created for Jane Rose Montalvo.
eMail/file Encryption Incorrect email address created for Jane Rose Montalvo.
>>>