Question

I'm trying to parse bbcode with regular expressions, and so far I've got this one working:

if re.search(r"(\[b\])", m, re.IGNORECASE):
    r = re.compile(r"\[b\](?P<name>.*?)\[\/b\]", re.IGNORECASE)
    m = r.sub(r'<b>\1</b>', m)

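In this case, however, I need more than one capture group, so that I can grab both the font's style attributes and the content wrapped in the font bbcode, for example: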

[f color="#fff" ...]string[/f]

I can't get this to work; the output always ends up like this:

string</font>

Here is my regex code. I don't know what I'm doing wrong here.

if re.search(r"(\[f .*?\])", m, re.IGNORECASE):
    r = re.compile(r"\[f (?P<tag>.*?)\](?P<name>.*?)\[\/f\]", re.IGNORECASE)
    m = r.sub(r'<font \g<tag>>\g<name></font>', m)

Answer 1

Daniel, looking at your sample code, you're after something like this:

result = re.sub(r"\[f ([^\]]*)\]([^\[]*)\[/[^\]]*\]", r"<font \1>\2</font>", subject)

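With [f color="#fff" ...]string[/f] as input, the output is <font color="#fff" ...>string</font>. Of course that isn't valid HTML, but it is what your code is trying to produce, and from here you can easily adjust the replacement to get exactly what you want.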
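To make the substitution above easier to try out, here is a minimal self-contained sketch. The helper name convert_font_tags and the sample string are made up for illustration, and re.IGNORECASE is carried over from the question's code rather than from the one-liner above.

```python
import re

# Pattern from the answer: group 1 holds the attributes, group 2 the tag body.
# re.IGNORECASE is an assumption borrowed from the question's own code.
FONT_TAG = re.compile(r"\[f ([^\]]*)\]([^\[]*)\[/[^\]]*\]", re.IGNORECASE)


def convert_font_tags(text):
    """Rewrite [f attrs]body[/f] as <font attrs>body</font> (hypothetical helper)."""
    return FONT_TAG.sub(r"<font \1>\2</font>", text)


sample = '[f color="#fff"]string[/f] and [F size="2"]more[/F]'
print(convert_font_tags(sample))
```

Because the body group is [^\[]*, a font tag whose content contains another bbcode tag will not match; nested tags would need a different approach.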

Explain Regex

\[                       # '['
f                        # 'f '
(                        # group and capture to \1:
  [^\]]*                 #   any character except: '\]' (0 or more
                         #   times)
)                        # end of \1
\]                       # ']'
(                        # group and capture to \2:
  [^\[]*                 #   any character except: '\[' (0 or more
                         #   times)
)                        # end of \2
\[                       # '['
/                        # '/'
[^\]]*                   # any character except: '\]' (0 or more
                         # times)
\]                       # ']'
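The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch: the group names attrs and body and the constant name FONT_TAG_VERBOSE are illustrative choices, not part of the original answer.

```python
import re

# Same pattern as above, written with re.VERBOSE so the comments live in the regex.
# The group names "attrs" and "body" are illustrative assumptions.
FONT_TAG_VERBOSE = re.compile(
    r"""
    \[f[ ]              # opening '[f ' (the space is written as [ ] under re.VERBOSE)
    (?P<attrs>[^\]]*)   # attributes: any characters except ']'
    \]                  # ']'
    (?P<body>[^\[]*)    # body: any characters except '['
    \[/[^\]]*\]         # a closing tag such as '[/f]'
    """,
    re.VERBOSE | re.IGNORECASE,
)

match = FONT_TAG_VERBOSE.search('[f color="#fff"]string[/f]')
attrs = match.group("attrs")
body = match.group("body")

# Named groups are referenced as \g<name> in the replacement string.
html = FONT_TAG_VERBOSE.sub(r"<font \g<attrs>>\g<body></font>", '[f color="#fff"]string[/f]')
print(attrs, body, html)
```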

Answer 2

Try this package: https://pypi.python.org/pypi/bbcode

Writing this code yourself is probably not a good idea.

Using multiple capture groups to parse bbcode
