Question

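I found many related links, but nothing that covers what I'm after. I want a regular expression that matches opening and closing tags by negation. Take this string as an example: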

<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>

I use a regular expression to match <em> and <b> while leaving <p> and <span> alone. I do this with the following regular expression:

<(?!p|span)[^>]*>

The problem is that the above also matches </p> and </span>. I want those closing tags left alone as well. I tried:

<(/)?(?!p|span)[^>]*>

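and various combinations of it, but nothing I tried works. I'm hoping to get some help. How can I set up a regular expression to match these without doing something like <(?!p|span)[^>]*>(.*?)</(?!p|span)[^>]*> (which looks bad and probably needs more resources)?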
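The behaviour described above can be reproduced with a short sketch. Python's `re` module is an assumption here (the question does not name a regex engine), and the variable names are illustrative only. It suggests why both attempts fail: the lookahead is evaluated right after `<`, where a closing tag has `/` rather than a tag name, and an optional `(/)?` can backtrack to match nothing.

```python
import re

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# First attempt: the lookahead runs immediately after "<".
# For a closing tag the next character is "/", not "p" or "span",
# so the negative lookahead succeeds and the tag is matched.
first = [m.group(0) for m in re.finditer(r"<(?!p|span)[^>]*>", html)]

# Second attempt: "(/)?" is optional, so when the lookahead fails after
# consuming "/", the engine backtracks, skips the slash, and the
# lookahead is tested in front of "/" again.
second = [m.group(0) for m in re.finditer(r"<(/)?(?!p|span)[^>]*>", html)]

print(first)
print(second)
```

Both lists come out identical, and both still contain the closing `span` and `p` tags.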

Answer 1

Try this:

(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)

Explanation

<!--
(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)

Options: case insensitive; ^ and $ match at line breaks

Match the regular expression below «(?:<(em|b)[^<>]*?>)»
   Match the character “<” literally «<»
   Match the regular expression below and capture its match into backreference number 1 «(em|b)»
      Match either the regular expression below (attempting the next alternative only if this one fails) «em»
         Match the characters “em” literally «em»
      Or match regular expression number 2 below (the entire group fails if this one fails to match) «b»
         Match the character “b” literally «b»
   Match a single character NOT present in the list “<>” «[^<>]*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match the character “>” literally «>»
Match the regular expression below and capture its match into backreference number 2 «([^<>]+)»
   Match a single character NOT present in the list “<>” «[^<>]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=</\1>)»
   Match the characters “</” literally «</»
   Match the same text as most recently matched by capturing group number 1 «\1»
   Match the character “>” literally «>»
-->

This pattern matches the tag data as a whole, for tags that have a start and end pair.
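As a rough illustration, here is how the paired pattern behaves on the sample string from the question. Python is an assumption (the thread does not name an engine), the flags are chosen to mirror the "Options" line in the explanation, and `paired` is just an illustrative name.

```python
import re

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# IGNORECASE and MULTILINE mirror "case insensitive; ^ and $ match at line breaks".
paired = re.compile(r"(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)",
                    re.IGNORECASE | re.MULTILINE)

for m in paired.finditer(html):
    # group(1) is the tag name, group(2) the text between the tags.
    # The closing tag is only asserted by the lookahead, so it is
    # not part of the overall match in group(0).
    print(repr(m.group(0)), m.group(1), m.group(2))
```

Because the content part is `[^<>]+`, an element whose content contains another tag would not match at all; the sketch only covers flat cases like the sample.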

But if you only want to remove the tags, you can use:

</?(em|b)[^<>]*?>
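A minimal sketch of that removal, again assuming Python's `re` module. The second pattern is not from the answer: it is a hypothetical variant that keeps the negated form the question asked for, by moving the optional slash inside the lookahead so that it cannot be skipped by backtracking.

```python
import re

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# Pattern from the answer: list the tags to remove.
strip_listed = re.compile(r"</?(em|b)[^<>]*?>", re.IGNORECASE)
print(strip_listed.sub("", html))

# Hypothetical negated variant (an assumption, not the answerer's code):
# list the tags to keep. "/?" sits inside the lookahead, and "\b" stops
# "p" from also excluding longer names such as "pre".
strip_others = re.compile(r"<(?!/?(?:p|span)\b)[^<>]*>", re.IGNORECASE)
print(strip_others.sub("", html))
```

On the sample string both substitutions leave `<p>This is <span>a</span> sentence.</p>`. Note that `(em|b)` followed by `[^<>]*?` would also match longer tag names that begin with those letters, such as `<br>` or `<body>`; adding `\b` after the group is one way to tighten it if that matters for your input.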
