Question

I have a string variable containing:

string = "123hello456world789"

The string contains no spaces. I want to write a regular expression that prints only the words made of letters (a-z). I tried a simple regex:

pat = "([a-z]+){1,}"
match = re.search(pat, string, re.DEBUG)

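The match object contains only the word hello; the word world is not matched.

When I use re.findall() I get both hello and world.

My question is: why can't we do this with re.search()?

How can this be done with re.search()?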

Answer 1

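re.search() finds the pattern only once in the string. From the documentation:

Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

To match every occurrence, you need re.findall(). From the documentation:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.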
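Since search() stops at the first location that matches, one way to see what it is doing is to call it repeatedly and resume after each hit. The sketch below is only an illustration: search_all is a hypothetical helper name, and it relies on the optional start position that compiled pattern objects accept in search(). In practice re.findall() or re.finditer() is the simpler route.

```python
import re

# Pattern and sample text are taken from the question; search_all is a made-up name.
WORD = re.compile(r"[a-z]+")

def search_all(pattern, text):
    """Collect every match by calling search() repeatedly, resuming after each hit."""
    found = []
    pos = 0
    while True:
        m = pattern.search(text, pos)  # compiled patterns accept a start position
        if m is None:
            break
        found.append(m.group())
        # Continue after the match; step by one on an empty match to avoid an endless loop.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return found

words = search_all(WORD, "123hello456world789")
print(words)
```

Each call returns at most one match object, which is why a single re.search() call can never report both words.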

Example:

>>> import re
>>> regex = re.compile(r'([a-z]+)', re.I)
>>> # using search we only get the first item.
>>> regex.search("123hello456world789").groups()
('hello',)
>>> # using findall we get every item.
>>> regex.findall("123hello456world789")
['hello', 'world']
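The findall() documentation also says the shape of the result depends on how many groups the pattern has. A small sketch of that rule, using the question's sample string; the two-group pattern is an assumption added purely for illustration:

```python
import re

text = "123hello456world789"

no_group = re.findall(r"[a-z]+", text)           # no groups: list of whole matches
one_group = re.findall(r"([a-z]+)", text)        # one group: list of that group's text
two_groups = re.findall(r"([a-z]+)(\d+)", text)  # two groups: list of tuples

print(no_group)
print(one_group)
print(two_groups)
```

With zero or one group the result is a flat list of strings; with two groups each match becomes a tuple.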

Update

Because of your duplicate question (as discussed at this link), I am adding my other answer here as well:

>>> import re
>>> regex = re.compile(r'([a-z][a-z-\']+[a-z])')
>>> regex.findall("HELLO W-O-R-L-D")  # this has uppercase
[]  # there are no results here, because the string is uppercase
>>> regex.findall("HELLO W-O-R-L-D".lower())  # let's lowercase
['hello', 'w-o-r-l-d']  # now we have results
>>> regex.findall("123hello456world789")
['hello', 'world']

As you can see, the first sample you provided failed because it is uppercase. You could simply add the re.IGNORECASE flag, but you mentioned that the matches should be lowercase only.
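To make the difference between the two options concrete, here is a sketch comparing the case-sensitive pattern, the re.IGNORECASE flag, and lowercasing the input first. The pattern is the one from the session above, written without the group and without the unneeded backslash; the variable names are only illustrative.

```python
import re

pattern = r"[a-z][a-z-']+[a-z]"
text = "HELLO W-O-R-L-D"

case_sensitive = re.findall(pattern, text)               # uppercase input: nothing matches
ignore_case = re.findall(pattern, text, re.IGNORECASE)   # matches, original case preserved
lowered = re.findall(pattern, text.lower())              # matches, results are lowercase

print(case_sensitive)
print(ignore_case)
print(lowered)
```

The flag keeps the original casing in the results, so lowercasing the input is the option that fits a lowercase-only requirement.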

Answer 2

@InbarRose's answer shows why re.search works that way, but if you want match objects rather than just the string output of re.findall, use re.finditer:

>>> for match in re.finditer(pat, string):
...     print(match.groups())
...
('hello',)
('world',)
>>>

Or, if you want a list:

>>> list(re.finditer(pat, string))
[<_sre.SRE_Match object at 0x022DB320>, <_sre.SRE_Match object at 0x022DB660>]
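What match objects give you over plain strings is extra information such as positions. A minimal sketch, assuming the same pat as in the question (simplified to a single group) and a variable named text in place of string:

```python
import re

pat = "([a-z]+)"
text = "123hello456world789"

# Each match object knows both the captured text and where it was found.
details = [(m.group(1), m.span()) for m in re.finditer(pat, text)]
print(details)
```

span() returns the start and end indexes of each match, which re.findall() does not expose.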

It is also generally a bad idea to use string as a variable name, since it is the name of a common standard-library module.
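A short sketch of the naming problem, assuming the string module has been imported in the same scope: once the name is rebound to a str, the module's attributes are no longer reachable through it.

```python
import string

lowercase = string.ascii_lowercase  # works: `string` is the standard-library module here

string = "123hello456world789"      # rebinding the name hides the module
try:
    string.ascii_lowercase
    shadowed = False
except AttributeError:
    shadowed = True                 # a str object has no ascii_lowercase attribute

print(lowercase, shadowed)
```

A name such as text or s avoids the clash.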

Python re.search
