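I found plenty of related links, but nothing that covers what I need. I want a regex that matches opening and closing tags by negation, that is, every tag except the ones I list. Take this string as an example: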
<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>
I am using a regex to match <em> and <b> while leaving <p> and <span> alone. I do this with the following regex:
<(?!p|span)[^>]*>
The problem is that the regex above also matches </p> and </span>. I want those closing tags left alone as well. I tried:
<(/)?(?!p|span)[^>]*>
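and various combinations of it, but nothing I have tried works. I would appreciate some help. How can I set up a regex to match these tags without resorting to something like:
<(?!p|span)[^>]*>(.*?)</(?!p|span)[^>]*>
(which looks ugly and probably uses more resources)?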
Answer 0 (score: 3)
Try this:
(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)
Explanation:
<!--
(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)

Options: case insensitive; ^ and $ match at line breaks

Match the regular expression below «(?:<(em|b)[^<>]*?>)»
   Match the character “<” literally «<»
   Match the regular expression below and capture its match into backreference number 1 «(em|b)»
      Match either the regular expression below (attempting the next alternative only if this one fails) «em»
         Match the characters “em” literally «em»
      Or match regular expression number 2 below (the entire group fails if this one fails to match) «b»
         Match the character “b” literally «b»
   Match a single character NOT present in the list “<>” «[^<>]*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match the character “>” literally «>»
Match the regular expression below and capture its match into backreference number 2 «([^<>]+)»
   Match a single character NOT present in the list “<>” «[^<>]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=</\1>)»
   Match the characters “</” literally «</»
   Match the same text as most recently matched by capturing group number 1 «\1»
   Match the character “>” literally «>»
-->
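To see what the breakdown above means in practice, here is a minimal sketch that runs the pattern against the sample string from the question. Python's re module is an assumption on my part (the thread does not name a language), and the names PAIR, pairs and matched are only illustrative.

```python
import re

# Pattern from the answer: an opening <em> or <b> tag, its text content,
# and a lookahead that requires the matching closing tag to follow.
PAIR = re.compile(r"(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)", re.IGNORECASE)

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# findall returns one (tag name, inner text) tuple per match.
pairs = PAIR.findall(html)
print(pairs)

# The closing tag is only asserted by the lookahead, so it is not part
# of the matched text itself.
matched = [m.group(0) for m in PAIR.finditer(html)]
print(matched)
```

Because the closing tag sits in a lookahead, replacing each match would leave </em> and </b> behind, so this form is better suited to extracting tag names and contents than to stripping tags.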
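The first pattern matches an opening tag plus its text content, and only when the matching closing tag follows (the lookahead checks for the closing tag without consuming it).
If you only want to remove the tags themselves, you can use: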
</?(em|b)[^<>]*?>
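The sketch below tries the tag-removal pattern on the question's sample string, and also shows why the asker's second attempt still hits </p> and </span>. Python is assumed here. NEGATED is my own variant, not something from the answer: it moves the optional slash inside the lookahead and adds a \b word boundary so that names like <pre> are not treated as <p>. All names are illustrative.

```python
import re

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# Whitelist form from the answer: opening or closing <em>/<b> tags.
STRIP = re.compile(r"</?(em|b)[^<>]*?>", re.IGNORECASE)
stripped = STRIP.sub("", html)
print(stripped)

# The question's second attempt. On "</p>" the optional slash first
# consumes "/", the lookahead then fails on "p", so the engine backtracks
# and lets the slash group match nothing. The lookahead now sees "/p",
# which is neither "p" nor "span", and the tag matches after all.
ATTEMPT = re.compile(r"<(/)?(?!p|span)[^>]*>")
attempt_hits = [m.group(0) for m in ATTEMPT.finditer(html)]
print(attempt_hits)

# Assumed fix (not from the thread): keep the optional slash inside the
# negative lookahead so it cannot be backtracked away, and require a word
# boundary after the tag name.
NEGATED = re.compile(r"<(?!/?(?:p|span)\b)[^>]*>")
negated_hits = [m.group(0) for m in NEGATED.finditer(html)]
print(negated_hits)
```

The whitelist pattern is simpler when the tags to remove are known in advance. The negated form fits the original question more closely, since it matches every tag except the listed ones.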