Question

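I'm planning to move one of my scrapers to Python. I'm comfortable using preg_match and preg_match_all in PHP, but I haven't found a suitable function in Python that works like preg_match. Could someone help me with this?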

For example, if I want to get the content between <a class="title" and </a>, I use the following function in PHP:

preg_match_all('/a class="title"(.*?)<\/a>/si',$input,$output);

In Python, I can't figure out which function does the same thing.

Answer 1

You're looking for Python's re module.

Take a look at re.findall and re.search.
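As a rough sketch of how these map onto the PHP functions: re.search plays the role of preg_match (first match only) and re.findall plays the role of preg_match_all. The sample HTML string and the variable names below are made up for illustration.

```python
import re

# Hypothetical sample input, made up for this illustration.
html = '<a class="title" href="/one">First</a>\n<A CLASS="title" href="/two">Second\nlink</A>'

# re.S (DOTALL) and re.I (IGNORECASE) correspond to PHP's /s and /i modifiers.
pattern = re.compile(r'a class="title"(.*?)</a>', re.S | re.I)

# Like preg_match_all: with one capture group, findall returns a list of the captured strings.
matches = pattern.findall(html)

# Like preg_match: search returns the first match object, or None if nothing matched.
first = pattern.search(html)
first_text = first.group(1) if first else None
```

Note that the capture group picks up everything after class="title", including the rest of the opening tag, just as the PHP pattern does.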

Since you mentioned that you are trying to parse HTML, use an HTML parser instead. There are several options in Python, such as lxml or BeautifulSoup.

Take a look at this: Why you should not parse html with regex
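To give an idea of the parser approach without installing anything, here is a minimal sketch that uses the standard library's html.parser rather than lxml or BeautifulSoup (that is a swap on my part, not what the answer names). The class TitleLinkParser and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collects the text inside every <a class="title"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.titles = []      # finished link texts
        self._buffer = None   # text parts of the link being read, or None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == "a" and dict(attrs).get("class") == "title":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


# Made-up sample markup for this illustration.
sample = '<p><a class="title" href="/one">First</a> <a href="/x">skip</a> <A CLASS="title">Second\nlink</A></p>'
parser = TitleLinkParser()
parser.feed(sample)
parser.close()
titles = parser.titles
```

This sketch only matches links whose class attribute is exactly "title"; a dedicated library such as BeautifulSoup or lxml handles multiple classes and malformed markup more gracefully.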

Answer 2

You might be interested in reading Python Regular Expression Operations

Answer 3

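I think you need something like this:

import re

# input holds the HTML text to search
output = re.search(r'a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)  # whole match; group(1) is only the captured part
    print(output)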

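You can add (?s) at the start of the regular expression to enable DOTALL mode, so that . also matches newlines (the equivalent of PHP's /s modifier):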

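import re

# (?s) lets "." match newlines, so the link text may span several lines
output = re.search(r'(?s)a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)  # whole match; group(1) is only the captured part
    print(output)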
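To see what (?s) actually changes, here is a small sketch comparing a search without it, with the inline form, and with the equivalent re.DOTALL flag. The input string is a made-up example whose link text spans two lines.

```python
import re

# Hypothetical input whose link text spans two lines.
text = '<a class="title">Multi\nline</a>'
pattern = r'a class="title"(.*?)</a>'

# Without DOTALL, "." stops at the newline, so there is no match.
no_dotall = re.search(pattern, text, flags=re.IGNORECASE)

# Inline (?s) and the re.DOTALL flag are two spellings of the same option.
inline = re.search('(?s)' + pattern, text, flags=re.IGNORECASE)
flagged = re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL)
```

Here no_dotall is None, while inline and flagged both match and capture the same text.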

Python equivalent of PHP's preg_match
