Question

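Original

I want to parse a string of HTML and add a newline after the opening form tag and after each closing tag. This is the code so far. It gives me an error on the "re.sub" line, and I don't understand why the regular expression fails.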

def user(): 
    tags = "<form><label for=\"email_field\">Email:</label><input type=\"email\" name=\"email_field\"/><label for=\"password_field\">Password:</label><input type=\"password\" name=\"password_field\"/><input type=\"submit\" value=\"Login\"/></form>"
    result = re.sub("(</.*?>)", "\1\n", tags)
    return dict(form_code=result)

PS. I suspect this may not be the best approach... but I would still like to learn how to do it this way.

Edit

I was missing "import re" in default.py. Thanks, ruakh.

import re

My page source now looks like this (inspected in the client browser). The actual page displays the form code as text rather than as UI elements.

&lt;form&gt;&lt;label for=&quot;email_field&quot;&gt;Email:&lt;/label&gt;
&lt;input type=&quot;email&quot; name=&quot;email_field&quot;/&gt;&lt;label     
for=&quot;password_field&quot;&gt;Password:&lt;/label&gt;
&lt;input type=&quot;password&quot; name=&quot;password_field&quot;/&gt;&lt;input   
type=&quot;submit&quot; value=&quot;Login&quot;/&gt;&lt;/form&gt;

Edit 2

After adding the XML() helper in default.py, the form code renders as UI elements. Thanks to Anthony for the help. The correction:

return dict(form_code=XML(result))

Final edit

I worked out the regex fix myself. It is not the best solution, but at least it works. The final code:

import re
def user(): 
    tags = "<form><label for=\"email_field\">Email:</label><input type=\"email\" name=\"email_field\"/><label for=\"password_field\">Password:</label><input type=\"password\" name=\"password_field\"/><input type=\"submit\" value=\"Login\"/></form>"
    tags = re.sub(r"(<form>)", r"<form>\n  ", tags)
    tags = re.sub(r"(</.*?>)", r"\1\n  ", tags)
    tags = re.sub(r"(/>)", r"/>\n  ", tags)
    tags = re.sub(r"(  </form>)", r"</form>\n", tags)
    return dict(form_code=XML(tags))
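
For comparison, the breaks after closing and self-closing tags could probably be done in one pass. This is only a sketch, not part of the original code: add_line_breaks is a hypothetical helper name, it runs outside web2py (so no XML() call), it does not add the break after the opening form tag or any indentation, and it assumes no attribute value contains "/>" or "</".

```python
import re

def add_line_breaks(html):
    """Insert a newline after each closing tag (</...>) and self-closing tag (/>)."""
    # Alternation: either a full closing tag, or the "/>" that ends a self-closing tag.
    return re.sub(r"(</[^>]*>|/>)", r"\1\n", html)

sample = '<form><label for="a">A:</label><input name="a"/></form>'
print(add_line_breaks(sample))
```

Regular expressions remain fragile for HTML in general, so this is best kept to small, known-good snippets like the form above.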

Answer 1

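The only problem I can see is that you need to change "\1\n" to r"\1\n" (using "raw" string notation); otherwise \1 is interpreted as an octal escape (meaning the character U+0001). But that by itself should not give you an error. What error message are you getting?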
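
The difference is easy to check in a plain Python session. The snippet below is a small illustration with a made-up input string, not the asker's form markup:

```python
import re

html = "<b>bold</b><i>italic</i>"

# Without the r prefix, Python turns "\1" into the control character U+0001
# before the regex engine ever sees it, so the matched tag is lost.
plain = re.sub("(</.*?>)", "\1\n", html)

# With the r prefix, \1 reaches the regex engine as a backreference to group 1.
raw = re.sub(r"(</.*?>)", r"\1\n", html)

print(repr(plain))  # closing tags replaced by '\x01' plus a newline
print(repr(raw))    # closing tags kept, each followed by a newline
```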

Answer 2

By default, for security reasons, web2py escapes all text inserted in views. To avoid this, simply use the XML() helper in the controller:

return dict(form_code=XML(result))

Or in the view:

{{=XML(form_code)}}

Do not do this unless the code comes from a trusted source; otherwise it might contain malicious JavaScript.
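
To see what this kind of escaping does to markup, here is a rough illustration using the standard library's html.escape. This is a stand-in, not web2py's own escaping routine, and the sample strings are made up for the example:

```python
import html

form_code = '<label for="email_field">Email:</label>'
untrusted = "<script>alert(1)</script>"

# Escaping turns markup characters into entities, so the browser shows
# the tags as text instead of rendering (or executing) them.
escaped_form = html.escape(form_code)
escaped_script = html.escape(untrusted)

print(escaped_form)
print(escaped_script)
```

The first result has the same shape as the page source shown in the question; the second shows why the escaping is on by default.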

Add a newline after every closing HTML tag in web2py
