Question

I want to extract paragraphs or divs from HTML, but only those that do not contain a form. For example:

<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text</p>

I have a working variant without the form filter:

/(?:<(?:p|div)[^>]*>)(.*)(?:<\/(?:p|div)>)/iu

And a variant with the filter, which does not work:

/(?:<(?:p|div)[^>]*>)((?:.(?!<form))*)(?:<\/(?:p|div)>)/iu

Can you help me?

Answer 1

Warning: parsing HTML with regular expressions has always been, and always will be, a bad idea.
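For comparison, here is a minimal sketch of the parser route, assuming Python and its standard `html.parser` module (neither is mentioned in the question). The class name `FormFreeBlocks` and the helper `form_free_blocks` are hypothetical. The sketch only collects the text of outermost `<p>`/`<div>` blocks and skips any block in which a `<form>` tag was seen.

```python
from html.parser import HTMLParser


class FormFreeBlocks(HTMLParser):
    """Collect the text of outermost <p>/<div> blocks that contain no <form>."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting depth inside <p>/<div>
        self.has_form = False   # True once a <form> is seen in the current block
        self.parts = []         # text chunks of the current block
        self.blocks = []        # finished, form-free blocks

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "div"):
            if self.depth == 0:
                # A new outermost block starts: reset the per-block state.
                self.parts, self.has_form = [], False
            self.depth += 1
        elif tag == "form" and self.depth:
            self.has_form = True

    def handle_endtag(self, tag):
        if tag in ("p", "div") and self.depth:
            self.depth -= 1
            if self.depth == 0 and not self.has_form:
                self.blocks.append("".join(self.parts))

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def form_free_blocks(html):
    # Hypothetical convenience wrapper around the parser above.
    parser = FormFreeBlocks()
    parser.feed(html)
    parser.close()
    return parser.blocks


SAMPLE = """<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text</p>"""

print(form_free_blocks(SAMPLE))
```

Unlike the regex, this also catches `<form>` tags that carry attributes or are spread over several lines, but it returns plain text only (inner tags are dropped).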

Here is a slightly modified version of your regex:

/(?:<(?:p|div)[^>]*>)(?!.*\<form\>)(.*)(?:<\/(?:p|div)>)/iu

I adjusted it so that you can still capture any paragraph that contains the word "form" (as opposed to the tag). Try it with this test input:

<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text even if it contains the "form" word!</p>
<p>I want to take this text</p>
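To check this against the test input, here is a small sketch that ports the patterns to Python's `re` module (an assumption; the original patterns are written with `/.../iu` delimiters). `ANSWER` is the pattern from this answer, `BROKEN` is the filtered variant from the question, and `TEMPERED` is a hypothetical alternative that moves the lookahead in front of the dot.

```python
import re

# Pattern from this answer: reject the block if <form> appears later on the line.
ANSWER = re.compile(r"<(?:p|div)[^>]*>(?!.*<form>)(.*)</(?:p|div)>", re.IGNORECASE)

# Filtered variant from the question: the lookahead runs AFTER each character,
# so a <form> sitting directly behind the opening tag is never checked.
BROKEN = re.compile(r"<(?:p|div)[^>]*>((?:.(?!<form))*)</(?:p|div)>", re.IGNORECASE)

# Hypothetical alternative: test the lookahead BEFORE consuming each character.
TEMPERED = re.compile(r"<(?:p|div)[^>]*>((?:(?!<form).)*)</(?:p|div)>", re.IGNORECASE)

SAMPLE = """<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text even if it contains the "form" word!</p>
<p>I want to take this text</p>"""

for name, pattern in (("answer", ANSWER), ("broken", BROKEN), ("tempered", TEMPERED)):
    # findall returns the captured inner text of every matching block.
    print(name, pattern.findall(SAMPLE))
```

On this input `ANSWER` and `TEMPERED` return the last two paragraphs, while `BROKEN` returns all three. Two caveats: `.` does not cross newlines here, so a block must sit on one line, and `ANSWER` only looks for a literal `<form>`, so a tag with attributes such as `<form action="...">` slips through.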
