Someone posted a fantastic solution in "How can I escape all code within <code></code> tags to allow people to post code?"
The problem is that this only works if the tag is exactly <code></code>. It breaks with <code id="lol"></code>, for example, since the tag contains an attribute. How can I account for this, so that strings inside the code tag are strictly escaped whether or not the tag has any attributes?
I apologize if there is an obvious solution to this. Regexes give me nightmares.
Edit
As I explained in the question initially, the post that is supposedly a duplicate does not account for a <code> tag that has a class or any other attributes.
Answer 0 (score: 1)
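Despite my comment above, I'll do my best to give you a regex. However, I stress that I do not recommend using a regex for this; use an HTML parser instead.
Your regex should look something like this: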
<\s*code(.*?)>(.+?)<\s*\/code\s*>
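Breaking it down a bit:
\s* matches zero or more whitespace characters.
code matches the literal string "code".
.*? is a lazy match of zero or more characters. It matches everything (if anything) up to the end of the opening tag, which is what lets attributes through.
(.+?) is a capturing group containing a lazy match of one or more characters. Because it needs at least one character between the tags, a completely empty <code></code> pair is not matched.
Finally, <\s*\/code\s*> matches the closing tag, which may contain whitespace. Note that the slash (/) is escaped. That is required wherever / delimits the pattern (JavaScript regex literals, or PHP patterns written as /.../) and is harmless elsewhere.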
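As a rough illustration of how the pattern above could be applied, here is a minimal Python sketch. The thread does not say which language is in use, so Python is an assumption, and escape_code_blocks is a hypothetical helper name. The DOTALL and IGNORECASE flags are additions that are not part of the answer's regex: without DOTALL, the dot does not match newlines, so multi-line code blocks would be skipped.

```python
import html
import re

# The pattern from the answer. The flags are assumptions added for this sketch:
# DOTALL lets "." span newlines, IGNORECASE also accepts <CODE>.
CODE_TAG = re.compile(r"<\s*code(.*?)>(.+?)<\s*\/code\s*>", re.DOTALL | re.IGNORECASE)


def escape_code_blocks(text: str) -> str:
    """Escape HTML special characters inside <code ...>...</code> only."""

    def _replace(match: re.Match) -> str:
        attrs = match.group(1)  # everything between "code" and ">", e.g. ' id="lol"'
        body = match.group(2)   # the raw, unescaped code
        # Keep the attributes untouched and escape only the body.
        return "<code" + attrs + ">" + html.escape(body) + "</code>"

    return CODE_TAG.sub(_replace, text)


if __name__ == "__main__":
    print(escape_code_blocks('<code id="lol"><b>hi</b></code>'))
    print(escape_code_blocks("<p>x</p><code>1 & 2</code>"))
```

This inherits the usual weaknesses of matching HTML with a regex: the pattern would also accept a tag such as <codex>, and an attribute value containing > would end the opening tag early. An HTML parser, as the answer recommends, avoids both.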