Question

I am trying to use a Python regular expression on a URL string.

>>> import re
>>> id = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
>>> re.search('news|ejournals|theses',id).group()
'ejournals'
>>> re.findall('news|ejournals|theses',id)
['ejournals', 'news']

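According to the documentation at http://docs.python.org/2/library/re.html#finding-all-adverbs, search() matches the first occurrence and findall() finds all possible matches in the string.

I am wondering why 'news' is not captured by search, even though it is listed first in the pattern.

Did I use the wrong pattern? I want to check whether any of these keywords occurs in the string.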

Answer 1

re.search() stops after the first occurrence in the string that satisfies the pattern, not at the first alternative listed in the pattern.
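Because search() stops at the first occurrence, it is enough for the asker's stated goal of checking whether any keyword appears at all. A minimal sketch, assuming the keywords are plain strings; the names KEYWORDS, KEYWORD_RE and contains_keyword are made up for illustration and do not come from the question:

```python
import re

# Hypothetical keyword list, taken from the pattern in the question.
KEYWORDS = ['news', 'ejournals', 'theses']

# re.escape guards against keywords that contain regex metacharacters.
KEYWORD_RE = re.compile('|'.join(re.escape(k) for k in KEYWORDS))

def contains_keyword(text):
    """Return True if any keyword occurs anywhere in text."""
    return KEYWORD_RE.search(text) is not None

url = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
print(contains_keyword(url))
print(contains_keyword('http://example.org/index.html'))
```

Which keyword search() happens to return does not matter for this check; only whether a match object comes back.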

Answer 2

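You are thinking about it backwards. The regex walks through the target string looking for "news" OR "ejournals" OR "theses" and returns the first one it finds. In this case, "ejournals" appears first in the target string.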
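To make the "position in the string wins" point concrete, here is a small sketch that prints where each match starts. The variable names are illustrative, and the 'newsome' example at the end is an addition meant to show the one case where the order of alternatives does matter, namely when several alternatives can match at the same position:

```python
import re

url = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
pattern = re.compile('news|ejournals|theses')

# search() reports the leftmost match in the string.
first = pattern.search(url)
print(first.group(), first.start())

# finditer() yields every non-overlapping match, left to right,
# which makes the scanning order visible.
for m in pattern.finditer(url):
    print(m.start(), m.group())

# The order of alternatives only decides between candidates that
# start at the same position in the string.
print(re.search('new|news', 'newsome').group())   # 'new'
print(re.search('news|new', 'newsome').group())   # 'news'
```

In the URL, "ejournals" starts earlier than the "news" inside "newsome", so it is what search() returns regardless of how the alternatives are ordered.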

Answer 3

Note that there are other differences between search and findall that are not covered here. For example:

python-regex why findall find nothing, but search works?
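One such difference, shown here as an illustration and not necessarily the exact situation in the linked question, is how capture groups are handled: search() returns a match object whose group() is the whole match, while findall() returns the contents of the groups when the pattern contains any. The sample text and patterns below are assumptions chosen for the demonstration:

```python
import re

text = 'VALib/v48_n4/newsome.html'

# search(): group() with no argument is the entire matched text.
full = re.search(r'v(\d+)_n(\d+)', text).group()
print(full)            # 'v48_n4'

# findall() with capture groups returns the groups, not the whole match.
groups = re.findall(r'v(\d+)_n(\d+)', text)
print(groups)          # [('48', '4')]

# Non-capturing groups (?:...) make findall() return whole matches again.
non_capturing = re.findall(r'v(?:\d+)_n(?:\d+)', text)
print(non_capturing)   # ['v48_n4']
```

If findall() returns unexpected tuples or empty strings, switching to non-capturing groups or to finditer() is a common way to get the full matches back.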

Answer 4

>>> id = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'

>>> re.search('news|ejournals|theses', id).group()
'ejournals'

re.search -> searches for the first occurrence in the string and then exits.

>>> re.findall('news|ejournals|theses', id)
['ejournals', 'news']

re.findall -> searches for all matches in the string and returns them as a list.

Python regex - difference between search and findall
