I have a string like this:
string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"
I want to get groups like this: (2014)(Blah) - (Blah)(Blah Blah Blah-Blah Blah). I tried to do it with this code:
import re
pattern = r"([0-9]{4})(.*)-(\w*)\s(.*)"
search_result = re.search(pattern, string, re.M | re.I)
if search_result:
    print(search_result.groups())
else:
    print("Nothing matched")
But it returns:
('2014', ' Blah - Blah Blah Blah Blah', 'Blah', 'Blah')
Where am I going wrong?
Answer 0 (score: 0)
Make the first `.*` group non-greedy:
pattern = r"([0-9]{4})(.*?)\s*-\s*(\w*)\s(.*)"
or exclude the hyphen from that group:
pattern = r"([0-9]{4})([^-]+)\s*-\s*(\w*)\s(.*)"
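As a quick check, here is a minimal sketch that runs both patterns against the string from the question. The variable names (`lazy`, `negated`, and so on) are only illustrative. Note that neither pattern consumes the whitespace after the year, so the second group can come back with surrounding spaces; stripping each group afterwards is one possible way to tidy that up, not the only one.

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"

# The two patterns from this answer, written as raw strings.
lazy = r"([0-9]{4})(.*?)\s*-\s*(\w*)\s(.*)"
negated = r"([0-9]{4})([^-]+)\s*-\s*(\w*)\s(.*)"

lazy_groups = re.search(lazy, string).groups()
negated_groups = re.search(negated, string).groups()

# Group 2 may carry leading/trailing whitespace, so strip every group.
lazy_clean = tuple(g.strip() for g in lazy_groups)
negated_clean = tuple(g.strip() for g in negated_groups)

print(lazy_groups)
print(negated_groups)
print(lazy_clean)
```

After stripping, both patterns should give the same four groups for this input.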
Answer 1 (score: 0)
Your regex has a problem:
                      v--- change it to `.*?`
pattern = "([0-9]{4})(.*)-(\w*)\s(.*)"
                      ^-- greedy; add `?` to make it non-greedy
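To see the difference in isolation, the following sketch (with a deliberately reduced pattern that is not part of the original question) compares where a greedy `.*` and a lazy `.*?` stop when they are followed by a hyphen. The greedy form runs to the last hyphen in the string, while the lazy form stops at the first one.

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"

# Greedy: .* grabs as much as possible, then backtracks to the LAST hyphen.
greedy = re.search(r"(.*)-", string)

# Lazy: .*? grabs as little as possible, so it stops at the FIRST hyphen.
lazy_match = re.search(r"(.*?)-", string)

print(repr(greedy.group(1)))
print(repr(lazy_match.group(1)))
```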
You can use a regex like this:
(\d*)\s(.*?)\s-\s(.*?)\s(.*)
Working demo
Match information
MATCH 1
1. [0-4] `2014`
2. [5-9] `Blah`
3. [12-16] `Blah`
4. [17-41] `Blah Blah Blah-Blah Blah`
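The match information can also be reproduced locally. This is a small sketch using Python's `re` module with the regex above; the `spans` list and the output format are just illustrative choices that mimic the listing.

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"
pattern = r"(\d*)\s(.*?)\s-\s(.*?)\s(.*)"

match = re.search(pattern, string)
spans = []
if match:
    # Walk over every capture group and record its offsets and text.
    for i in range(1, match.lastindex + 1):
        spans.append((i, match.span(i), match.group(i)))
        print("%d. [%d-%d] `%s`" % (i, match.start(i), match.end(i), match.group(i)))
else:
    print("Nothing matched")
```

Because the whitespace around the hyphen is matched outside the groups here, no stripping should be needed afterwards.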