Python regex capture groups

Time: 2015-03-18 19:20:03

Tags: python regex

I have a string like this:

 string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"

I want to get groups like this: (2014)(Blah) - (Blah)(Blah Blah Blah-Blah Blah)

I tried to do it with this code:
import re

pattern = r"([0-9]{4})(.*)-(\w*)\s(.*)"
search_result = re.search(pattern, string, re.M | re.I)
if search_result:
    print(search_result.groups())
else:
    print("Nothing matched")

But it returns:

('2014', ' Blah - Blah Blah Blah Blah', 'Blah', 'Blah')

Where am I going wrong?

2 answers:

Answer 0: (score: 0)

Make the first `.*` group non-greedy:

pattern = r"([0-9]{4})(.*?)\s*-\s*(\w*)\s(.*)"

Or use a negated character class so the group cannot run past the first hyphen:

pattern = r"([0-9]{4})([^-]+)\s*-\s*(\w*)\s(.*)"
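
As a quick check, the sketch below runs both patterns against the string from the question. It assumes Python 3; the names `lazy`, `negated`, and `cleaned` are made up for illustration. Both patterns leave some whitespace inside the second group, so the sketch also shows one possible way to strip it.

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"

# Pattern 1: lazy quantifier, expands only until the first hyphen can match.
lazy = r"([0-9]{4})(.*?)\s*-\s*(\w*)\s(.*)"
# Pattern 2: negated character class, cannot cross a hyphen at all.
negated = r"([0-9]{4})([^-]+)\s*-\s*(\w*)\s(.*)"

lazy_groups = re.search(lazy, string).groups()
negated_groups = re.search(negated, string).groups()

print(lazy_groups)     # second group is ' Blah' (leading space kept)
print(negated_groups)  # second group is ' Blah ' (spaces on both sides)

# Strip the captured text if the surrounding spaces are unwanted.
cleaned = tuple(g.strip() for g in negated_groups)
print(cleaned)
```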

Answer 1: (score: 0)

There is a problem with your regex:

                            v--- Change it to `.*?`
pattern = "([0-9]{4})(.*)-(\w*)\s(.*)"
                       ^-- greedy you have to add `?` to make it ungreedy
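
To see why the greedy `.*` is the culprit, here is a minimal sketch (Python 3 assumed; the shortened patterns and variable names are illustrative only) comparing how far each quantifier runs before it hands over to the hyphen:

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"

# Greedy: .* grabs as much as possible, then backs off only to the LAST hyphen.
greedy = re.search(r"([0-9]{4})(.*)-", string)
# Lazy: .*? grabs as little as possible, so it stops at the FIRST hyphen.
lazy = re.search(r"([0-9]{4})(.*?)-", string)

print(repr(greedy.group(2)))  # ' Blah - Blah Blah Blah Blah'
print(repr(lazy.group(2)))    # ' Blah '
```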

You can use a regex like this:

(\d*)\s(.*?)\s-\s(.*?)\s(.*)

Working demo


Match information

MATCH 1
1.  [0-4]   `2014`
2.  [5-9]   `Blah`
3.  [12-16] `Blah`
4.  [17-41] `Blah Blah Blah-Blah Blah`
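
The match information can also be reproduced locally. The sketch below assumes Python 3 and uses `Match.span()` to get each group's offsets; the `rows` list and the output layout are illustrative and not taken from the demo tool.

```python
import re

string = "2014 Blah - Blah Blah Blah Blah-Blah Blah"
pattern = r"(\d*)\s(.*?)\s-\s(.*?)\s(.*)"

match = re.search(pattern, string)
rows = []
if match:
    # Group 0 is the whole match, so the capture groups start at 1.
    for i in range(1, match.re.groups + 1):
        start, end = match.span(i)
        rows.append((i, start, end, match.group(i)))
        print("%d.  [%d-%d]  `%s`" % (i, start, end, match.group(i)))
else:
    print("Nothing matched")
```

Offsets are zero-based and the end index is exclusive, so `string[start:end]` gives back the captured text.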