Question

How can I rewrite this new way to recognise addresses so that it works in Python?

\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))

Answer 1

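The original source states that "this pattern should work in most modern regex implementations", and specifically Perl. Python's regex implementation is modern and similar to Perl's, but it lacks the [:punct:] character class. You can easily build an equivalent using the following: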

>>> import string, re
>>> pat = r'\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^%s\s]|/)))'
>>> pat = pat % re.sub(r'([-\\\]])', r'\\\1', string.punctuation)

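The re.sub() call escapes the characters that need escaping inside a character set: the hyphen, the backslash, and the closing bracket.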
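To make the escaping step concrete, here is a minimal sketch that runs the same re.sub() call on its own and checks that the resulting character class matches every ASCII punctuation character. The names escaped and punct_class are made up for this illustration and are not part of the answer's code.

```python
import re
import string

# Backslash-escape only the three characters that are special inside [...]:
# the hyphen, the backslash, and the closing bracket.
escaped = re.sub(r'([-\\\]])', r'\\\1', string.punctuation)

# Wrap the escaped text in brackets to get a stand-in for [[:punct:]].
punct_class = re.compile('[%s]' % escaped)

# Every punctuation character matches; letters, digits and spaces do not.
assert all(punct_class.fullmatch(ch) for ch in string.punctuation)
assert punct_class.search('abc XYZ 019') is None
```

Exactly three backslashes are added, so the escaped string is three characters longer than string.punctuation.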

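Edit: Using re.escape() works too, since it simply puts a backslash in front of the characters it escapes (older Python versions escaped nearly every non-alphanumeric character; 3.7 and later escape only characters that are special in a regex). This seemed crude to me at first, but it certainly works fine for this case. It is used in place of the re.sub() line: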

>>> pat = pat % re.escape(string.punctuation)
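As a rough end-to-end check, the sketch below compiles the pattern with the re.escape() variant and runs it over a short piece of text. The sample text and the names TEMPLATE, URL_RE, sample and urls are hypothetical and exist only for this illustration.

```python
import re
import string

# Gruber's pattern with a %s placeholder where Perl would use [:punct:].
TEMPLATE = r'\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^%s\s]|/)))'
URL_RE = re.compile(TEMPLATE % re.escape(string.punctuation))

# Hypothetical sample text, made up for this illustration.
sample = ("See http://example.com/path, "
          "(http://example.com/wiki/Python_(language)) or www.example.org.")

# The pattern contains several groups, so findall() would return tuples;
# finditer() with group(0) gives the full match for each URL instead.
urls = [m.group(0) for m in URL_RE.finditer(sample)]
print(urls)
```

With this sample the trailing comma and full stop should be left out of the matches, while the balanced "(language)" suffix is kept, which is the behaviour the punctuation class is there to provide.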

Answer 2

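I don't think Python has the [:punct:] expression.

According to Wikipedia, [:punct:] is equivalent to the following character class (shown here with escaping that Python's re module accepts):

[-!"#$%&'()*+,./:;<=>?@\[\\\]^_`{|}~]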

Answer 3

Python does not have POSIX bracket expressions.

In ASCII, the [:punct:] bracket expression is equivalent to:

[!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]
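One way to sanity-check a hand-written class like this is to compare it against string.punctuation across the whole ASCII range. This is only a sketch; PUNCT_CLASS and matched are names invented for the example.

```python
import re
import string

# The explicit class from above, written as a raw triple-quoted string so the
# embedded quote characters need no extra escaping.
PUNCT_CLASS = re.compile(r'''[!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]''')

# Collect every ASCII character that the class matches.
matched = {chr(i) for i in range(128) if PUNCT_CLASS.fullmatch(chr(i))}

# The class should cover exactly the 32 characters in string.punctuation.
assert matched == set(string.punctuation)
```

If the assertion holds, the explicit class and a class built from string.punctuation can be used interchangeably for ASCII input.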

Gruber's URL regex in Python
