Question

I need a regular expression that matches the content inside a tag, without the tags themselves.

<p>content1<a>content2 <span>content3</span></a> content4</p>
<a href="link">content1 <span>content2</span> content3</a>

So far I have <.[^>]*>(.*?)<, but the tags are captured too. I want to match content1 content2 ... Thanks.

Answer 1

You don't need a regex pattern. Just assign an id to the p tag, or use jQuery to get the p tag, and try the following.

<p id="test">content1<a>content2 <span>content3</span></a> content4</p>

Then add this in JavaScript:

var result = document.getElementById("test").innerText
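As a minimal sketch of the same idea, the lookup can be wrapped in a small helper. The name textById and the doc parameter are hypothetical and not part of the answer. It uses textContent rather than innerText: textContent returns the raw text of the element and its descendants, while innerText takes rendering into account (hidden text is left out, for example).

```javascript
// Hypothetical helper (not from the answer): read an element's text by id.
// `doc` is any object exposing getElementById, e.g. the browser's `document`.
function textById(doc, id) {
  const el = doc.getElementById(id);
  // textContent is the text of the element and all its descendants,
  // with the tags themselves left out; return null if the id is not found.
  return el ? el.textContent : null;
}

// In a browser, with the <p id="test"> element above on the page:
if (typeof document !== 'undefined') {
  console.log(textById(document, 'test'));
}
```

Passing the document in as an argument is only a convenience so the helper can be tried outside a browser. In a page you could call document.getElementById directly, as in the line above.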

Answer 2

Finally, I found a way to capture the content without the child tags.

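// Read the whole HTML file named by the first command-line argument.
$handle = fopen($argv[1], 'r');
$html = fread($handle, filesize($argv[1]));
fclose($handle);
// Match each whole <p>...</p> or <a>...</a> element, then strip the tags inside it.
preg_match_all('/<p[^>]*>(.*?)<\/p>|<a[^>]*>(.*?)<\/a>/', $html, $matchs);
foreach ($matchs[0] as $content)
  echo strip_tags($content);
// With the HTML above, I get all the content.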
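The same two-step idea (match the outer element, then remove the tags left inside it) can be sketched in JavaScript. The function name extractText is an assumption made for illustration. Like the PHP above, this is a regex shortcut and not an HTML parser, so it is likely to break on nested elements of the same name.

```javascript
// Sketch of the two-step approach (extractText is a hypothetical name).
// Step 1: match each outer <p>...</p> or <a>...</a> element.
// Step 2: remove every remaining tag from the matched markup.
function extractText(html) {
  // [\s\S]*? also spans newlines; the backreference \1 pairs <p> with </p>
  // and <a> with </a>.
  const outer = /<(p|a)\b[^>]*>[\s\S]*?<\/\1>/g;
  const matches = html.match(outer) || [];
  return matches.map((element) => element.replace(/<[^>]*>/g, ''));
}

const sample =
  '<p>content1<a>content2 <span>content3</span></a> content4</p>\n' +
  '<a href="link">content1 <span>content2</span> content3</a>';

console.log(extractText(sample));
// → [ 'content1content2 content3 content4', 'content1 content2 content3' ]
```

No space is inserted where a tag is removed, which is why content1 and content2 run together in the first result.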

Please reconsider the downvote. I don't ask on Stack Overflow without searching a lot first...

Capturing the content of HTML tags
