I am using Python 2.7 and I am trying to match a string with this structure:
INPUT = 'abc-1-2 abc-2-3 abc-1-1 - TYP1 xyz-2-3 xyzzz - TYP2 ooop-1-1 abc-3-3 bbb - TYP3'
EXPECTED_OUTPUT = [
'abc-1-2 abc-2-3 abc-1-1 - TYP1',
'xyz-2-3 xyzzz - TYP2',
'ooop-1-1 abc-3-3 bbb - TYP3']
This is the solution I tried, but it does not work: Online Demo
Answer 0 (score: 1)
I think this is what you are looking for:
".+?TYP\d+"
Answer 1 (score: 1)
The following regex should do it:
\b.*?-\s.*?(?:\s|$)
Python:
import re
# A raw string works in both Python 2 and 3 (the ur"" prefix is Python 2 only).
regex = r"\b.*?-\s.*?(?:\s|$)"
text = "abc-1-2 abc-2-3 abc-1-1 - TYP1 xyz-2-3 xyzzz - TYP2 ooop-1-1 abc-3-3 bbb - TYP3"
for match in re.finditer(regex, text):
    # Every match except the last ends with the separating space.
    print(match.group())
Answer 2 (score: 1)
>>> re.findall(r'\S.*? - \S+', INPUT)
['abc-1-2 abc-2-3 abc-1-1 - TYP1', 'xyz-2-3 xyzzz - TYP2', 'ooop-1-1 abc-3-3 bbb - TYP3']
Explanation:
'\S' # any non-space character
'.*?' # (.) any character (*) zero or more times (?) non-greedy (match as few as possible)
' - ' # literal space dash space
'\S' # any non-space character
'+' # one or more times
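To see why the leading \S matters, here is a small sketch, assuming the INPUT string from the question. The names with_leading and without_leading are only illustrative. Without the leading \S, later matches begin with the space that separates them from the previous match.

```python
import re

# Input string taken from the question.
INPUT = 'abc-1-2 abc-2-3 abc-1-1 - TYP1 xyz-2-3 xyzzz - TYP2 ooop-1-1 abc-3-3 bbb - TYP3'

# With the leading \S, a match cannot start on a space.
with_leading = re.findall(r'\S.*? - \S+', INPUT)

# Without it, the lazy .*? starts right where the previous match ended,
# which is the separating space.
without_leading = re.findall(r'.*? - \S+', INPUT)

print(with_leading)
print(without_leading)
```

The second list contains the same groups, but the second and third entries carry a leading space.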
Answer 3 (score: 0)
The only way I see is to match the unbroken sequences together with the broken sequence in the middle:
[^\s-]+(?:-?[^\s-])*(?:\s+[^\s-]+(?:-?[^\s-])*)*\s+-\s+[^\s-]+
[^\s-]+            # Unbroken sequence XXX-XXX-XXX
(?:
    -?
    [^\s-]
)*
(?:                # Optional sequence <space> XXX-XXX-XXX
    \s+
    [^\s-]+
    (?:
        -?
        [^\s-]
    )*
)*
                   # Broken sequence <space> - <space> XXX
\s+                # Space
-                  # Dash
\s+                # Space
[^\s-]+            # XXX
Output
** Grp 0 - ( pos 0 , len 30 )
abc-1-2 abc-2-3 abc-1-1 - TYP1
** Grp 0 - ( pos 31 , len 20 )
xyz-2-3 xyzzz - TYP2
** Grp 0 - ( pos 52 , len 27 )
ooop-1-1 abc-3-3 bbb - TYP3
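The expanded pattern could probably be used from Python as-is with re.VERBOSE, which ignores whitespace and # comments inside the pattern. A minimal sketch, assuming the input string from the question; the names pattern and results are illustrative, and the inline comments are one reading of each part:

```python
import re

# Input string taken from the question.
INPUT = 'abc-1-2 abc-2-3 abc-1-1 - TYP1 xyz-2-3 xyzzz - TYP2 ooop-1-1 abc-3-3 bbb - TYP3'

pattern = re.compile(r"""
    [^\s-]+ (?: -? [^\s-] )*              # first token, such as abc-1-2
    (?: \s+ [^\s-]+ (?: -? [^\s-] )* )*   # further space-separated tokens
    \s+ - \s+                             # the lone dash between spaces
    [^\s-]+                               # the type token, such as TYP1
""", re.VERBOSE)

# Collect (start position, length, text) for every match.
results = [(m.start(), len(m.group()), m.group()) for m in pattern.finditer(INPUT)]
for pos, length, text in results:
    print("pos %d, len %d: %s" % (pos, length, text))
```

The positions and lengths printed this way should line up with the output listed above.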
Answer 4 (score: 0)
The simplest would be:
import re
string = "abc-1-2 abc-2-3 abc-1-1 - TYP1 xyz-2-3 xyzzz - TYP2 ooop-1-1 abc-3-3 bbb - TYP3"
rx = re.compile(r'(.+?TYP\d)\s*')
parts = rx.findall(string)
print(parts)
# ['abc-1-2 abc-2-3 abc-1-1 - TYP1', 'xyz-2-3 xyzzz - TYP2', 'ooop-1-1 abc-3-3 bbb - TYP3']
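One thing to watch for, shown here as a hedged sketch: \d matches exactly one digit, so a type such as TYP10 would be cut after its first digit. The sample string below is hypothetical (it does not come from the question) and only illustrates the difference between \d and \d+.

```python
import re

# Hypothetical input with a two-digit type number, not from the question.
sample = "a-1 - TYP10 b-2 - TYP2"

# \d stops after one digit, so the "0" of TYP10 leaks into the next match.
single_digit = re.findall(r'(.+?TYP\d)\s*', sample)

# \d+ consumes the whole number.
many_digits = re.findall(r'(.+?TYP\d+)\s*', sample)

print(single_digit)
print(many_digits)
```

If the type numbers never exceed one digit, both patterns behave the same on the question's input.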