Python / regex token conversion with argument capture

Time: 2012-10-26 20:26:06

Tags: python regex

I have a string of the form:

"Hello, this is a test. Let's tag @[William Maness], and then tag @[Another name], along with @[More Name]."

I would like to convert it to...

"Hello, this is a test. Let's tag <a href='/search/william-maness'>William Maness</a>, and then tag <a href='/search/another-name'>Another name</a>, along with [...]"

I'm fairly sure this can be done with a regular expression, but it's too complicated for me. Any help is appreciated.

3 answers:

Answer 0: (score: 2)

You can match any such name with:

r'@\[([^]]+)\]'

The capturing group surrounds the name inside the square brackets of the original text.
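As a quick check of what the pattern captures, here is a minimal sketch that runs it over the string from the question with re.findall. The variable names `example`, `pattern` and `names` are only illustrative.

```python
import re

# Sample string from the question.
example = ("Hello, this is a test. Let's tag @[William Maness], "
           "and then tag @[Another name], along with @[More Name].")

# [^]]+ means "one or more characters that are not a closing bracket".
pattern = re.compile(r'@\[([^]]+)\]')

# With a single capturing group, findall returns only the captured names.
names = pattern.findall(example)
print(names)  # ['William Maness', 'Another name', 'More Name']
```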

You can then pass a function to sub() to replace each name with a link, as your question asks:

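import re

example = "Hello, this is a test. Let's tag @[William Maness], and then tag @[Another name], along with @[More Name]."

def replaceReference(match):
    # group(1) is the name captured between the brackets
    name = match.group(1)
    return '<a href="/search/%s">%s</a>' % (name.lower().replace(' ', '-'), name)

refs = re.compile(r'@\[([^]]+)\]')
refs.sub(replaceReference, example)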

The function is called with a match object for each match found; use .group(1) to retrieve the capturing group.
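To make the match object less abstract, the following small sketch shows what its accessor methods return for a single match. The input sentence is made up for illustration.

```python
import re

m = re.search(r'@\[([^]]+)\]', "Let's tag @[William Maness].")

print(m.group(0))  # the whole match: @[William Maness]
print(m.group(1))  # the first capturing group: William Maness
print(m.groups())  # a tuple of all capturing groups: ('William Maness',)
```

Note that group(1) returns a string, while groups() always returns a tuple, so only the former can be used directly in the string formatting.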

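In this example the name is transformed in a very simple way, but you could, for instance, do an actual database check to see whether the name exists.

Demo: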

>>> refs.sub(replaceReference, example)
'Hello, this is a test. Let\'s tag <a href="/search/william-maness">William Maness</a>, and then tag <a href="/search/another-name">Another name</a>, along with <a href="/search/more-name">More Name</a>.'
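The database check mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `KNOWN_USERS` is a hypothetical in-memory set standing in for a real database lookup, and `replace_known_reference` is a made-up name, not part of the answer.

```python
import re

# Hypothetical stand-in for a database lookup: slugs of users that exist.
KNOWN_USERS = {'william-maness', 'another-name'}

refs = re.compile(r'@\[([^]]+)\]')

def replace_known_reference(match):
    name = match.group(1)
    slug = name.lower().replace(' ', '-')
    if slug not in KNOWN_USERS:
        # Unknown name: return the matched text unchanged.
        return match.group(0)
    return '<a href="/search/%s">%s</a>' % (slug, name)

result = refs.sub(replace_known_reference,
                  "Tag @[William Maness] and @[Nobody Here].")
print(result)
# Tag <a href="/search/william-maness">William Maness</a> and @[Nobody Here].
```

Returning match.group(0) leaves unrecognised tags exactly as they were typed, which is one possible policy; dropping the brackets or flagging the name would work equally well.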

Answer 1: (score: 2)

re.sub() also accepts a function, so you can process the replacement text:

import re

text = "Hello, this is a test. Let's tag @[William Maness], and then tag @[Another name], along with @[More Name]."

def replace(match):
    text = match.group(1)  # Extract the first capturing group

    return '<a href="/search/{0}">{1}</a>'.format(  # Format it into a link
        text.lower().replace(' ', '-'),
        text
    )

re.sub(r'@\[(.*?)\]', replace, text)

Or, if you are looking for a readable one-liner:

>>> import re
>>> re.sub(r'@\[(.*?)\]', (lambda m: (lambda x: '<a href="/search/{0}">{1}</a>'.format(x.lower().replace(' ', '-'), x))(m.group(1))), text)
'Hello, this is a test. Let\'s tag <a href="/search/william-maness">William Maness</a>, and then tag <a href="/search/another-name">Another name</a>, along with <a href="/search/more-name">More Name</a>.'

Answer 2: (score: 0)

Using @Martijn's regex:

>>> s
"Hello, this is a test. Let's tag @[William Maness], and then tag @[Another name], along with @[More Name]."
>>> re.sub(r'@\[([^]]+)\]', r'<a href="/search/\1">\1</a>', s)
'Hello, this is a test. Let\'s tag <a href="/search/William Maness">William Maness</a>, and then tag <a href="/search/Another name">Another name</a>, along with <a href="/search/More Name">More Name</a>.'

But you would still need to slugify the usernames (lowercase them and replace spaces with hyphens) for the URL.
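A plain replacement string cannot lowercase or hyphenate the captured text, so that step needs a function. The sketch below is one possible way to do it, assuming Python 3 and ASCII names; `slugify` and `to_link` are hypothetical helper names, and the HTML escaping of the visible name is an addition that the answers above do not include.

```python
import re
from html import escape

def slugify(name):
    # Lowercase, then turn each run of non-alphanumeric characters into one hyphen.
    return re.sub(r'[^a-z0-9]+', '-', name.lower()).strip('-')

def to_link(match):
    name = match.group(1)
    # Escape the visible name so characters such as & are safe inside HTML.
    return '<a href="/search/{0}">{1}</a>'.format(slugify(name), escape(name))

s = "Tag @[William Maness] and @[Tom & Jerry]."
linked = re.sub(r'@\[([^]]+)\]', to_link, s)
print(linked)
# Tag <a href="/search/william-maness">William Maness</a> and <a href="/search/tom-jerry">Tom &amp; Jerry</a>.
```

Names written in non-ASCII characters would produce an empty slug with this pattern, so a real application would need a different slug rule for them.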