Extracting div tags inside a div

Time: 2012-09-04 18:05:09

Tags: php arrays html preg-match

I want to extract the div tags nested inside a div...

The post.php file:

<body>
<div class="home">

<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
.
.
.
.
</div>
</body>

The extract.php file:

<?php 
$html = file_get_contents("post.php");
   $pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
   preg_match_all($pattern, $html, $matches);
   print_r($matches);

?>

But it gives me an empty array:

Array ( [0] => Array ( ) [1] => Array ( ) [2] => Array ( ) [3] => Array ( ) ) 

I want output like this:

Content number 14674248
Content number 14674255
Content number 14674278
Content number 14674279
Content number 14674283
Content number 14674290

Any help?

2 answers:

Answer 0 (score: 1)

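// Parse the file as HTML instead of matching it with a regular expression.
$html = new DOMDocument();
$html->loadHTMLFile("post.php");
$xpath = new DOMXPath($html);
// Select every div that is a direct child of <div class="home">.
$filtered = $xpath->query("//div[@class='home']/div");

// nodeValue is the text content of each matched div.
foreach ($filtered as $one) {
    echo $one->nodeValue . "\n";
}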
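
The query in this answer selects every direct child div of the `home` container. If the container could also hold unrelated divs, the XPath `starts-with()` function can restrict the match to ids beginning with `post_message_`. The following is only a sketch under assumptions: the markup is inlined as a string (the `other` div is made up for illustration), and `$markup`, `$doc`, and `$contents` are names chosen here, not taken from the question.

```php
<?php
// Sample markup inlined for illustration; the "other" div is hypothetical.
$markup = '<body><div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="other">Unrelated</div>
<div id="post_message_14674255">Content number 14674255</div>
</div></body>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Keep only child divs whose id attribute starts with "post_message_".
$nodes = $xpath->query("//div[@class='home']/div[starts-with(@id, 'post_message_')]");

$contents = [];
foreach ($nodes as $node) {
    $contents[] = trim($node->nodeValue);
}

echo implode("\n", $contents), "\n";
```

Collecting the text into an array first makes it easy to reuse the values later instead of only printing them.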

Answer 1 (score: 0)

Verify that file_get_contents() is actually working. If I run the following code with the HTML inlined as a string, I do get results:

<?php 
$html = '<div class="home">

<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
</div>
</body>';
   $pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
   preg_match_all($pattern, $html, $matches);
   print_r($matches);

?>
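
One way to check whether the read succeeds is to test the return value, since `file_get_contents()` returns `false` on failure. This is a minimal sketch, not a definitive fix; the helper name `readHtmlOrFail` is hypothetical and introduced only for illustration.

```php
<?php
// Hypothetical helper: read a file and fail loudly instead of passing false on.
function readHtmlOrFail(string $path): string
{
    // file_get_contents() returns false on failure; @ silences the warning
    // so that the failure is reported through the exception instead.
    $html = @file_get_contents($path);
    if ($html === false) {
        throw new RuntimeException("Could not read {$path}");
    }
    return $html;
}

try {
    $html = readHtmlOrFail("post.php");
    echo strlen($html), " bytes read\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

If the byte count is zero or the exception message appears, the empty result comes from the read step rather than from the regular expression.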

You may also want to change the regular expression to the following:

$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";
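
With this pattern the lazy quantifiers stop at the first `>` and the first `</div>`, and the single capture group holds only the text between the tags, so `$matches[1]` is the list the question asks for. A short sketch, assuming the markup is inlined as a string and shortened to two entries:

```php
<?php
// Sample markup inlined for illustration; in the question it comes from post.php.
$html = '<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
</div>';

// Lazy quantifiers (.*?) match as little as possible.
$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";
preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches; $matches[1] holds the captured text.
echo implode("\n", $matches[1]), "\n";
```

Because `.` does not match newlines by default, this only works while each post div sits on a single line; content spanning several lines would need the `s` modifier or a DOM parser.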