I want to extract the div tags nested inside a div...
post.php file:
<body>
<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
.
.
.
.
</div>
</body>
extract.php file:
<?php
$html = file_get_contents("post.php");
$pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
preg_match_all($pattern, $html, $matches);
print_r($matches);
?>
But it gives me an empty array:
Array ( [0] => Array ( ) [1] => Array ( ) [2] => Array ( ) [3] => Array ( ) )
I want output like this:
Content number 14674248
Content number 14674255
Content number 14674278
Content number 14674279
Content number 14674283
Content number 14674290
Any help?
Answer 0 (score: 1)
$html = new DOMDocument();
$html->loadHTMLFile("post.php");
$xpath = new DOMXPath($html);
$filtered = $xpath->query("//div[@class='home']/div");
foreach($filtered as $one){
echo $one->nodeValue."\n";
}
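A slightly more defensive variant of the same idea, as a sketch only. It assumes the markup is available as a string (the sample below is shortened from the question), collects libxml parse warnings instead of printing them, and selects only divs whose id starts with post_message_. The function name extractPostMessages is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text of every <div id="post_message_...">.
function extractPostMessages(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // starts-with() is an XPath 1.0 function, so DOMXPath supports it.
    $nodes = $xpath->query("//div[starts-with(@id, 'post_message_')]");

    $messages = [];
    foreach ($nodes as $node) {
        $messages[] = trim($node->nodeValue);
    }
    return $messages;
}

// Sample markup shortened from the question.
$html = '<body>
<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
</div>
</body>';

foreach (extractPostMessages($html) as $message) {
    echo $message, "\n";
}
```

Matching on the id prefix rather than on the parent's class means other child divs of .home, if any exist, are skipped.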
Answer 1 (score: 0)
Verify that file_get_contents() is actually working. If I run the following code, with the HTML inlined as a string, I get results:
<?php
$html = '<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
</div>
</body>';
$pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
preg_match_all($pattern, $html, $matches);
print_r($matches);
?>
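One possible way to make that check explicit, as a sketch: file_get_contents() returns false on failure, so the result can be compared strictly against false before the regex runs. The helper name readOrFail is hypothetical, and the demo writes a temporary file standing in for post.php so that it runs on its own.

```php
<?php
// Hypothetical helper: read a file or throw with a clear message.
function readOrFail(string $path): string
{
    // @ suppresses the PHP warning; failure is reported via the exception instead.
    $contents = @file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Could not read $path");
    }
    return $contents;
}

// Demo: a temporary file stands in for post.php (assumption for illustration).
$path = tempnam(sys_get_temp_dir(), 'post');
file_put_contents($path, '<div id="post_message_1">Content number 1</div>');

$html = readOrFail($path);
var_dump(strlen($html) > 0); // true when the file was actually read

unlink($path);
```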
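You may also want to change the regular expression to the following, which uses non-greedy quantifiers and captures only the text inside each div: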
$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";
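A short sketch of how that pattern could be used to print just the contents. The sample string is shortened from the question. With preg_match_all's default ordering, $matches[1] holds the first capture group of every match.

```php
<?php
// Sample markup shortened from the question.
$html = '<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
</div>';

$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";
preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches; $matches[1] holds the captured text.
foreach ($matches[1] as $content) {
    echo $content, "\n";
}
```

Note that the dot does not match newlines by default, so this only works when each post div sits on a single line, as in the sample.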