Question

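I am trying to get the content of a specific div with class="className" from a website and then store that content in a database. I am using the code below, but var_dump does not display anything. Please help, as I am completely inexperienced.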

Code:

<?php

  $doc = new DOMDocument();
  $doc->loadHTMLFile('http://www.someLink.com');

  foreach( $doc->getElementsByClassName('Classname') as $item){
    $class =  $item->getAttribute('div');
    var_dump($class);
 }

 ?>

Answer 1

DOMDocument->getElementsByClassName does not appear to be an existing method.

Try using XPath instead, as follows:

<?php
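    $doc = new DOMDocument();
    $doc->loadHTMLFile('http://www.image-plus.co.uk/');

    // Match every element whose class attribute contains the given class name
    $finder = new DOMXPath($doc);
    $class_name = "green";
    $nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $class_name ')]");

    // Copy the matching nodes into a fresh document so they can be serialized on their own
    $tmp_dom = new DOMDocument();
    foreach ($nodes as $node)
    {
        $tmp_dom->appendChild($tmp_dom->importNode($node, true));
    }
    // Serialize the collected nodes to an HTML string
    $innerHTML = trim($tmp_dom->saveHTML());
    echo $innerHTML;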
?>

Edit: fixed an error.
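
The XPath approach above serializes all matching elements into one string. If each match should become its own database row, a variation along these lines may help. It is only a sketch: the sample markup and the function name inner_html_by_class are made up for illustration, and the HTML is loaded from a string rather than a URL so the example runs without network access.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body>'
      . '<div class="green box"><p>First</p></div>'
      . '<div class="red">Skipped</div>'
      . '<div class="green">Second <b>item</b></div>'
      . '</body></html>';

// Return the inner HTML of every element that carries $className.
// (Illustrative helper, not part of any library.)
function inner_html_by_class(string $html, string $className): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $finder = new DOMXPath($doc);
    $nodes = $finder->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' $className ')]"
    );

    $results = [];
    foreach ($nodes as $node) {
        $inner = '';
        // Serialize each child so the wrapping element itself is left out
        foreach ($node->childNodes as $child) {
            $inner .= $doc->saveHTML($child);
        }
        $results[] = trim($inner);
    }
    return $results;
}

$blocks = inner_html_by_class($html, 'green');
print_r($blocks);
```

Each entry of $blocks holds the markup of one matching element, so the entries can be inserted one by one.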

Answer 2

I put together an example that gets the content of a div. That content can then easily be stored in a database.

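// file_get_html() comes from the third-party PHP Simple HTML DOM Parser, not from PHP itself
include 'simple_html_dom.php'; // assumed path to the library file
$html = file_get_html('Your website');
// Passing index 0 to find() returns only the first matching element
$element = $html->find('div[id=Your id]', 0);
echo $element;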
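
For the "store it in a database" part, a minimal sketch with PDO and a prepared statement could look like the following. Everything specific here is an assumption made for the example: the in-memory SQLite database, the table name scraped_content, its columns, and the placeholder value of $content, which stands in for whatever string the scraping step produced. The pdo_sqlite extension must be enabled for it to run.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table; adjust names and types to the real schema.
$pdo->exec(
    'CREATE TABLE scraped_content (
        id INTEGER PRIMARY KEY,
        source_url TEXT,
        content TEXT
    )'
);

// Placeholder values for the output of the scraping step.
$content = '<div id="main">Hello</div>';
$sourceUrl = 'http://www.example.com';

// A prepared statement keeps the scraped markup from being interpreted as SQL.
$stmt = $pdo->prepare(
    'INSERT INTO scraped_content (source_url, content) VALUES (:url, :content)'
);
$stmt->execute([':url' => $sourceUrl, ':content' => $content]);

// Read the row back to confirm it was stored unchanged.
$stored = $pdo->query('SELECT content FROM scraped_content')->fetchColumn();
echo $stored, PHP_EOL;
```

For MySQL or another database only the DSN passed to the PDO constructor (plus credentials) should need to change; the prepared statement stays the same.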

Answer 3

I wrote a small function that builds an array of the contents of the elements found with a given class.

<?php

function get_links($url,$classname) {

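    // Collected contents; starts empty so an array is returned even when nothing matches
    $result = [];

    // Create a new DOM Document 
    $xml = new DOMDocument('1.0', 'UTF-8');

    // Collect libxml warnings about malformed HTML internally instead of printing them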
    $internalErrors = libxml_use_internal_errors(true);

    // Load the html into the DOM
    $xml->loadHTMLFile($url);

    $xpath = new DOMXPath($xml);

    $classes = $xpath->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");

    for ($i = $classes->length - 1; $i > -1; $i--) {
        if(!empty($classes->item($i)->firstChild->nodeValue)){
            $result[] = $classes->item($i)->firstChild->nodeValue;
        }
    }
    // Restore error level
    libxml_use_internal_errors($internalErrors);

    return $result;

}


  $url = 'http://www.example.com';
  $classname ="someclass";
  $rows=get_links($url,$classname);

  var_dump($rows); // You will get an array of the contents that you can store in the database
  foreach($rows as $row){
   //insert DB command
  }

 ?>
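
One thing to be aware of with the function above: firstChild->nodeValue only covers the first child node of each match, and the loop walks the matches in reverse order. If the full text of each element is wanted in document order, something like the sketch below might be closer. The sample markup and the name get_text_by_class are invented for the example, and the HTML is passed in as a string so it can be tried without a live site.

```php
<?php
// Hypothetical markup standing in for a fetched page.
$html = '<div class="someclass"><p>Alpha</p><p>Beta</p></div>'
      . '<div class="someclass">Gamma</div>'
      . '<div class="other">Ignored</div>';

// Return the trimmed text of every element that carries $classname.
// (Illustrative helper, not part of any library.)
function get_text_by_class(string $html, string $classname): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
    );

    $result = [];
    foreach ($nodes as $node) {
        // textContent joins the text of all descendants, not just the first child
        $text = trim($node->textContent);
        if ($text !== '') {
            $result[] = $text;
        }
    }
    return $result;
}

$rows = get_text_by_class($html, 'someclass');
var_dump($rows);
```

Note that textContent drops the markup and joins the text of nested elements without separators, so the first entry here comes out as "AlphaBeta".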

Get the content of a div class with PHP and store it in a database table
