Question

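I want to extract the entire content of the div tag with the class viewContent, but when I run my code the match stops at the first closing div tag it reaches. What should I do? I have the sample code below, but it still only captures up to the first div's closing tag. Thanks for your help.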

    preg_match_all('#<div class="viewContent"[^>]*>(.*?)</div[^>]*>#is', $content, $s);
    print_r($s);

Answer 1

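Lazy or greedy matching is of little use here, because either way the pattern will inevitably match a </div> that does not correspond to <div class="viewContent">, such as the one closing <div class="viewControl">. The closing comment can be used instead, since it logically marks the end of the desired division.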
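To illustrate why neither quantifier helps, here is a minimal sketch. The sample markup in $content is hypothetical (a nested div inside viewContent, followed by a viewControl div); it is not the asker's actual page.

```php
<?php
// Hypothetical sample markup: a nested div inside viewContent,
// followed by a sibling viewControl div.
$content = '<div class="viewContent"><div class="inner">first</div> second</div>'
         . '<div class="viewControl">controls</div>';

// Lazy pattern from the question: stops at the FIRST </div> it can reach.
preg_match('#<div class="viewContent"[^>]*>(.*?)</div[^>]*>#is', $content, $lazy);

// Greedy variant: runs on to the LAST </div> in the input.
preg_match('#<div class="viewContent"[^>]*>(.*)</div[^>]*>#is', $content, $greedy);

echo $lazy[1], "\n";   // cut short, inside the nested div
echo $greedy[1], "\n"; // overshoots into the viewControl div
```

The lazy capture ends inside the nested div, and the greedy capture spills into the following div, so neither returns exactly the content of viewContent.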

Use the following regex to get only the contents of <div class="viewContent">.

Regex: <div class="viewContent"[^>]*>(.*?)<\/div[^>]*>(?=)

Explanation

(.*?) matches the content of the division using a lazy search.
(?=) is a positive lookahead; the comment that logically marks the end of the <div> goes inside it.
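A minimal sketch of the lookahead idea follows. The comment text `<!-- end viewContent -->` and the sample markup are assumptions made up for illustration; substitute whatever comment actually follows the closing tag in the real page.

```php
<?php
// Hypothetical markup: the viewContent div is followed by a closing comment.
$content = '<div class="viewContent"><div class="inner">first</div> second</div>'
         . '<!-- end viewContent -->'
         . '<div class="viewControl">controls</div>';

// The lazy group keeps expanding until a </div> is found that is
// immediately followed (optional whitespace allowed) by the assumed comment.
$pattern = '#<div class="viewContent"[^>]*>(.*?)</div[^>]*>'
         . '(?=\s*<!-- end viewContent -->)#is';

preg_match_all($pattern, $content, $s);

// $s[1] holds the captured inner HTML of each matching div.
print_r($s[1]);
```

This only works when the page reliably emits such a comment after the closing tag; without it, the lookahead has nothing to anchor on.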

Regex101 Demo

Answer 2

If you can guarantee that the closing tag for the div you want ends with , you can use:

<div class="viewContent"[^>]*>(.*?)</div[^>]*>

Otherwise, you might just want to use an HTML parser.

Answer 3

You can use PHP's built-in DOMDocument class to parse the HTML of the page, and the DOMXPath class to extract the value of an HTML element with a given class:

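<?php
$html = ''; // HTML goes here
$doc = new DOMDocument();
// The @ suppresses the warnings loadHTML() emits for malformed markup.
@$doc->loadHTML($html);
$classname = "viewContent";
$finder = new DOMXPath($doc);
// Select every element whose class attribute contains $classname.
$spanner = $finder->query("//*[contains(@class, '$classname')]");
foreach ($spanner as $entry) {
  // nodeValue is the element's text content, including text of nested elements.
  echo $entry->nodeValue;
}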
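Since nodeValue returns text only, here is a hedged variation that collects the inner HTML of each matching div instead, nested tags included. The sample markup and the $results variable are illustrative assumptions; the XPath expression matches viewContent as a whole class token rather than as a substring.

```php
<?php
// Hypothetical sample markup with a nested div.
$html = '<div class="viewContent"><div class="inner">first</div> second</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$finder = new DOMXPath($doc);
// Match "viewContent" as a complete class name, not as part of a longer one.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' viewContent ')]";

$results = [];
foreach ($finder->query($query) as $div) {
    $inner = '';
    // Serialize each child node to rebuild the div's inner HTML.
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $results[] = $inner;
}

print_r($results);
```

Because the parser tracks nesting itself, this does not depend on a closing comment or on how deeply the divs are nested.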

Preg_match to get the content of a div tag within div tags
