Matching multiple words with a regular expression in Python

Time: 2014-03-05 05:06:07

Tags: python regex

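I need to match a pattern in a string. The strings vary, so the pattern has to allow for some variation.
What I need to extract is the words that appear before "layout". They occur in 4 different forms: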

1 word -- layout, e.g. hsr layout

2 words -- layout, e.g. golden garden layout

digit-word -- layout, e.g. 19th layout

digit-word word -- layout, e.g. 20th garden layout

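As you can see, the digit part needs to be optional, and a single regular expression has to cover all four forms. This is what I have tried: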

import re
p = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
text = "opp when the 19th hsr layout towards"
q = re.findall(p,text)

I need to extract "19th hsr layout" from this text, but the code above returns nothing. What is wrong with it?

Some example strings:

str1 = " 25/4 16th june road ,watertank layout ,blr"  #extract watertank layout 
str2 = " jacob circle 16th rusthumbagh layout , 5th cross" #extract 16th rusthumbagh layout
str3 = " oberoi splendor garden blossoms layout , 5th main road"  #extract garden blossoms layout
str4 = " belvedia heights , 15th layout near Jaffrey gym" #extract 15th layout

2 answers:

Answer 0 (score: 2)

As I said in my comment, use r'(?:\w+\s+){1,2}layout':

>>> import re
>>> p = re.compile(r'(?:\w+\s+){1,2}layout')
>>> p.findall(" 25/4 16th june road ,watertank layout ,blr")
['watertank layout']
>>> p.findall(" jacob circle 16th rusthumbagh layout , 5th cross")
['16th rusthumbagh layout']
>>> p.findall(" oberoi splendor garden blossoms layout , 5th main road")
['garden blossoms layout']
>>> p.findall(" belvedia heights , 15th layout near Jaffrey gym")
['15th layout']

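{1,2} lets the group (?:\w+\s+) repeat once or twice, so the pattern matches one or two words before "layout". Because \w also matches digits, words such as 19th need no special handling.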
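As a rough sketch of why the two patterns behave differently, the snippet below runs both on the question's text (with "layout" spelled out) and on the four forms listed in the question. The variable names question_pattern and answer_pattern are only illustrative. The likely cause of the empty result is that the second \w+ in the question's pattern must consume at least one character before the literal l, which cannot happen when the word itself starts with l.

```python
import re

# Pattern from the question: after the whitespace, \w+ needs at least one
# word character *before* the literal "l", so "layout" can never match here.
question_pattern = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')

# Pattern from this answer: one or two "word + whitespace" groups, then "layout".
answer_pattern = re.compile(r'(?:\w+\s+){1,2}layout')

text = "opp when the 19th hsr layout towards"

print(question_pattern.findall(text))  # []
print(answer_pattern.findall(text))    # ['19th hsr layout']

# The four forms listed in the question; each is returned whole.
samples = ["hsr layout", "golden garden layout", "19th layout", "20th garden layout"]
for sample in samples:
    print(answer_pattern.findall(sample))

# With more than two words before "layout", only the last two are kept.
print(answer_pattern.findall("oberoi splendor garden blossoms layout"))
# ['garden blossoms layout']
```

Note that the pattern always takes two words when two are available, so it cannot tell "watertank layout" from "road watertank layout" unless something other than whitespace (such as the comma in the first sample string) separates the words.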

Answer 1 (score: 1)

This seems to work:

import re

l = [" 25/4 16th june road ,watertank layout ,blr",
" jacob circle, 16th rusthumbagh layout , 5th cross",
" oberoi splendor , garden blossoms layout , 5th main road",
" belvedia heights , 15th layout near Jaffrey gym",]

for ll in l:
    # Capture everything between a comma and the word "layout".
    print(re.search(r',([\w\s]+)layout', ll).groups())

Output:

('watertank ',)
(' 16th rusthumbagh ',)
(' garden blossoms ',)
(' 15th ',)
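The captured groups above keep their surrounding whitespace and leave out the word "layout" itself, and the pattern only matches when a comma comes before the phrase. A minimal sketch of one way to tidy this up follows; the helper name words_before_layout is hypothetical and not part of the answer.

```python
import re

# Same idea as the answer: capture the text between a comma and "layout".
pattern = re.compile(r',([\w\s]+)layout')

def words_before_layout(address):
    """Return the phrase ending in "layout", or None if no comma precedes it.

    Hypothetical helper for illustration only.
    """
    match = pattern.search(address)
    if match is None:
        # Without a comma before the phrase the pattern cannot match.
        return None
    # Drop the surrounding whitespace and put "layout" back.
    return match.group(1).strip() + " layout"

# Input with a comma before the phrase, as in this answer's list.
print(words_before_layout(" jacob circle, 16th rusthumbagh layout , 5th cross"))
# 16th rusthumbagh layout

# The question's original string has no comma there, so nothing is found.
print(words_before_layout(" jacob circle 16th rusthumbagh layout , 5th cross"))
# None
```

Checking for None before calling group avoids the AttributeError that re.search(...).groups() raises when there is no match.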