simple_html_dom: accessing the ul inside a div

Date: 2017-03-14 10:19:53

Tags: php simple-html-dom

I am using simple_html_dom in an HTML scraping script, and I ran into a problem when trying to grab the ul inside a div.

HTML

<div class="attributes">
    <div class="headline">test header</div>
    <ul>
        <li>test 1</li>
        <li>test 2</li>
        <li>test 3</li>
    </ul>
</div>

PHP

//call to function
$url = 'http://example.com';

$data = dlPage2($url,'.attributes');
echo $data;


//function

function dlPage2($href,$element) {

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($curl, CURLOPT_HEADER, false);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_URL, $href);
    curl_setopt($curl, CURLOPT_REFERER, $href);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
    $str = curl_exec($curl);
    curl_close($curl);

    // Create a DOM object
    $dom = new simple_html_dom();
    // Load HTML from a string
    $dom->load($str);
    // Return the outer HTML of the first element matching the selector
    $dom = $dom->find($element, 0)->outertext;
    return $dom;
}

With the code above I can grab the whole <div class="attributes">, but I need the HTML of the <ul> tag inside that div.

Can someone help me change this?

1 answer:

Answer 0: (score: 1)

You have to select the <ul> inside $element, using a descendant selector:

$dom = $dom->find($element.' ul', 0)->outertext;
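Note that find() with an index returns null when nothing matches, so reading ->outertext directly on the result can fail on pages that lack the element. Below is a minimal sketch of a guarded version, run against the sample HTML from the question. It assumes simple_html_dom.php is available on the include path; firstOuterHtml is a hypothetical helper name used only for illustration, not part of the library.

```php
<?php
// Assumption: the simple_html_dom library file is on the include path.
require_once 'simple_html_dom.php';

// Hypothetical helper: outer HTML of the first node matching $selector,
// or null when the HTML cannot be parsed or nothing matches.
function firstOuterHtml(string $html, string $selector): ?string
{
    $dom = str_get_html($html);       // returns false on empty or oversized input
    if (!$dom) {
        return null;
    }
    $node = $dom->find($selector, 0); // null when there is no match
    $result = $node ? $node->outertext : null;
    $dom->clear();                    // release the parser's internal references
    return $result;
}

// Sample markup from the question, collapsed onto one line.
$html = '<div class="attributes"><div class="headline">test header</div>'
      . '<ul><li>test 1</li><li>test 2</li><li>test 3</li></ul></div>';

// Same selector the answer builds with $element . ' ul'.
$ul = firstOuterHtml($html, '.attributes ul');
echo $ul === null ? "no match\n" : $ul . "\n";
```

In dlPage2 the same guard would go around the find() call, with the function returning null (or an empty string) when the selector matches nothing.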