I need to search for a string that may look like any of the following:
<div class="icon_star"> </div>
or
<div class="icon_star"></div>
or
<div class="icon_star"> </div>
I need to find the strings above inside HTML that may look like this:
<h1 class="redword" tag="h1">
<span class="BASE">good</span>
</h1>
<span class="headword-definition"> - definition</span>
</span>
<div class="icon_star"></div>
<!-- End of DIV icon_star-->
<div class="icon_star"></div>
<!-- End of DIV icon_star-->
<div class="icon_star"></div>
<!-- End of DIV icon_star-->
</div><!-- End of DIV -->
<div class="headbar">
<div id="helplinks-box" class="responsive_hide_on_smartphone">
The string I am trying to find and store in an array can occur multiple times.
I tried the following regular expression:
preg_match_all ('/<div(\s)+class="icon_star">(.*?)<\/div>/i', $html1, $result_array1);
The regular expression above does not work when the HTML to search is:
<div id="headword">
<div id="headwordright">
<div style="display: none;" id="showmore"><a class="button" onmousedown="foldingSet(false)"><span class="label">Show more</span></a>
</div><!-- End of DIV -->
<div id="showless"><a class="button" onmousedown="foldingSet(true)"><span class="label">Show less</span></a>
</div><!-- End of DIV -->
</div><!-- End of DIV -->
<span class="BASE-FORM">
<h1 tag="h1" class="redword"><span class="BASE">scenario</span></h1>
<span class="headword-definition"> - definition</span>
</span>
<div class="icon_star"> </div><!-- End of DIV icon_star-->
</div>
Answer 0 (score: 3)
Update
It looks like you are reading the regex result the wrong way. Running
preg_match_all('/<div(\s)+class="icon_star">.*?<\/div>/i', $html, $result_array1);
for($x = 0; $x < count($result_array1); $x++)
$result_array1[$x] = array_map('htmlentities', $result_array1[$x]);
echo '<pre>' . print_r($result_array1, 1);
prints
Array
(
[0] => Array
(
[0] => <div class="icon_star"> </div>
)
[1] => Array
(
[0] =>
)
)
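So the full matches are in $result_array1[0], and that is what you should check, not $result_array1 itself. Index 1 only holds what the (\s) capture group matched.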
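A minimal sketch of reading the matches this way. The sample `$html` string and the names `$matches` and `$stars` are made up for illustration. The pattern also drops the capture group and adds the `s` modifier so that `.` can span line breaks; neither change is in the original pattern.

```php
<?php
// Hypothetical sample input covering the variants from the question.
$html = '<div class="icon_star"> </div>'
      . '<div class="icon_star"></div>'
      . '<div  class="icon_star">&nbsp;</div>';

// No capture group is needed here, so \s+ replaces (\s)+.
// The "s" modifier lets "." match newlines inside the div.
preg_match_all('/<div\s+class="icon_star">.*?<\/div>/is', $html, $matches);

// $matches[0] is the list of full matches; this is the array to iterate.
$stars = $matches[0];

foreach ($stars as $i => $star) {
    // Escape the markup so it is visible when viewed in a browser.
    echo $i . ': ' . htmlentities($star) . PHP_EOL;
}

echo 'found: ' . count($stars) . PHP_EOL;
```

With a pattern that has no capture groups, `$matches` contains only index 0, which makes it harder to mistake the outer array for the list of results.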
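Side note
Rather than parsing HTML with regular expressions, consider using PHP's built-in DOMDocument class if you can.
The following code extracts the three divs.
Note that this approach needs valid HTML to work.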
// your HTML, with tags added to make it valid
$html = '<div>
<h1 class="redword" tag="h1">
<span class="BASE">good</span>
</h1>
<span class="headword-definition"><span> - definition</span></span>
<div class="icon_star"></div>
<div class="icon_star"></div>
<div class="icon_star"></div>
</div>
<div class="headbar">
<div id="helplinks-box" class="responsive_hide_on_smartphone">
</div>
</div>';
$dom = new DOMDocument();
@$dom->loadHTML($html);
$x = new DOMXPath($dom);
// this XPath query selects every node whose "class" attribute contains the substring "icon_star"
$nodes = $x->query("//*[contains(@class, 'icon_star')]");
$res = '';
foreach($nodes as $node) {
/**
* @var $node DOMElement
*/
$res .= $dom->saveHTML($node);
}
echo htmlentities($res);
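Because `contains(@class, 'icon_star')` is a substring test, it would also select a class such as `icon_star_big`. The sketch below shows one common way to match the whole class token instead, and uses `libxml_use_internal_errors()` in place of the `@` operator. The sample markup, the `icon_star_big` class and the variable names are assumptions for illustration and do not come from the question.

```php
<?php
// Hypothetical sample: the second div has a similar but different class name.
$html = '<div>'
      . '<div class="icon_star"></div>'
      . '<div class="icon_star_big"></div>'
      . '<div class="big icon_star"> </div>'
      . '</div>';

// Collect parser warnings internally instead of silencing them with @.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces so that only the whole
// token "icon_star" matches, not names that merely contain it.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' icon_star ')]";

$found = [];
foreach ($xpath->query($query) as $node) {
    $found[] = $dom->saveHTML($node);
}

echo 'found: ' . count($found) . PHP_EOL;
```

Here the first and third divs are selected and the `icon_star_big` div is skipped.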
These Stack Overflow questions are also worth reading:
How do you parse and process HTML/XML in PHP?
Getting DOM elements by classname