Get text from <li> </li>

Asked: 2012-05-03 18:29:10

Tags: php html parsing domparser

I have some <a> tags inside <li> elements, as shown below:

<li> <a href="link1"> one <li>
<li> <a href="link2"> two <li>
<li> <a href="link3"> three <li>

How can I use an HTML DOM parser to get the text two and then put it into an array for later use?

2 answers:

Answer 0 (score: 4)

You need to make sure the a tags are closed; then you can do this:

<?php 
$html = '<li> <a href="link1"> one </a> </li>
<li> <a href="link2"> two </a> </li>
<li> <a href="link3"> three </a> </li>
';

// Create a new DOM Document
$xml = new DOMDocument();

// Load the html contents into the DOM
$xml->loadHTML($html);

// Empty array to hold all links to return
$result = array();

//Loop through each <li> tag in the dom
foreach($xml->getElementsByTagName('li') as $li) {
    //Loop through each <a> tag within the li, then extract the node value
    foreach($li->getElementsByTagName('a') as $links){
        $result[] = $links->nodeValue;
    }
}
//Print the collected link texts
print_r($result);
/*
Array
(
    [0] =>  one 
    [1] =>  two 
    [2] =>  three 
)

*/
?>

All of this is covered in the manual for DOMDocument.
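
The question asks specifically for the text two, and the loop above returns every link with its surrounding whitespace intact. As a minimal sketch, assuming the li elements are closed properly and that the wanted link can be identified by its href value (link2 in the sample markup), DOMXPath can narrow the selection and trim() can clean up the text. The variable names $texts and $two are illustrative only and do not come from the answer.

```php
<?php
// Same sample markup as above, with the <li> and <a> elements closed.
$html = '<li><a href="link1"> one </a></li>
<li><a href="link2"> two </a></li>
<li><a href="link3"> three </a></li>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// All link texts: every <a> that is a direct child of an <li>.
$texts = array();
foreach ($xpath->query('//li/a') as $a) {
    // textContent keeps the surrounding whitespace, so trim it.
    $texts[] = trim($a->textContent);
}

// Only the link whose href is "link2" (assumption: href identifies it).
$two = array();
foreach ($xpath->query('//li/a[@href="link2"]') as $a) {
    $two[] = trim($a->textContent);
}

print_r($texts);
print_r($two);
```

Here $texts should hold the three trimmed labels and $two only the label of the second link. If the target has to be found by its text rather than its href, filtering $texts in PHP after the loop is a simple alternative.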

Answer 1 (score: 0)

Consider using Simple HTML DOM Parser to achieve this. Sample code:

// include the simple html dom parser
include 'simple_html_dom.php'; 

// load the html with one of the suitable methods available with it
$html = str_get_html('<li><a href="link1">one</a></li><li><a href="link2">two</a></li>');

// create a blank array to store the results
$items = array();

// loop through "li" elements and store the magic plaintext attribute value inside $items array
foreach( $html->find('li') as $li ) $items[] = $li->plaintext;

// this should output: Array ( [0] => one [1] => two ) 
print_r( $items );