Question

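I wrote code that searches for a specific document name (for example SZ-1000) and collects all the links inside div class="box" (index.php). A document name may contain one or two documents, each with its own ID (26904, 26905). That part works like a charm: I get the IDs back.

Now I want to list all the attachments on each linked page, as links. The tricky part is that there is no div element there, only a table or dd marking where the attachments are.

Site >
- Document name (SZ-1000) >
  - Document ID >
    - Attachment links

I think the problem is in this part: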

$xpath->query('//table[@id="content clear-block"]');

Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35

Result of var_dump($articles) in import.php:

object(DOMNodeList)#3 (0) { }
object(DOMNodeList)#3 (0) { }

My code:

index.php

$site = 'http://192.168.0.1:81/?q=search/node/SZ-1000';
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//div[@class="box"]');

$links = array();
   foreach($articles as $container) {
   $arr = $container->getElementsByTagName("a");
     foreach($arr as $item) {
      $href =  $item->getAttribute("href");
      $links[] = $href;
     }
}
   foreach($links as $link){
     // explode() replaces the removed split(); end() needs a real variable
     $parts = explode('/', $link);
     $text = end($parts);
     echo $text."<br>";
     $wr_out = file_get_contents("http://127.0.0.1/test/import.php?value=".$text);
     echo $wr_out;
  }

import.php

$value = $_GET['value'];
$site = 'http://192.168.0.1:81/?q=node/'.$value;
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//table[@id="content clear-block"]');

$links = array();
   foreach($articles as $container) {
   $arr = $container->getElementsByTagName("a");
      foreach($arr as $item) {
      $href =  $item->getAttribute("href");
      $links[] = $href;
      echo $href;
     }
}

Thanks for your replies!

Edit:

'Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35'

Fix for the error:

echo $wr_out->tagName;

Answer 1

Oookay, I finally got it working. Here is the solution for posterity. It uses UTF-8 character encoding.

index.php

    <?php

    // Get a variable from an external source; I use this with a Google spreadsheet

    $get = $_GET['get'];
    $site = 'http://192.168.0.1:81/?q=search/node/'.$get;
    $html = file_get_contents($site);

    //libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXpath($doc);
    $articles = $xpath->query('//div[@class="box"]');

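    // A DOMNodeList is an object, so empty() never reports it as empty; check its length
    if($articles !== false && $articles->length > 0){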

    $links = array();
    foreach($articles as $container) {
       $arr = $container->getElementsByTagName("a");
       foreach($arr as $item) {
          $href =  $item->getAttribute("href");
          $links[] = $href;
       }
    }
    $wr_out = "";

    foreach($links as $value){
        // explode() replaces the removed split(); end() needs a real variable
        $parts = explode('/', $value);
        $text = end($parts);
        $wr_out.=file_get_contents("http://127.0.0.1/projekt/search/import.php?value=".$text);

    }

    if(empty($wr_out))
        echo "There is no document with that ID";
        else
    echo $wr_out;
    }
    else
    echo "There is no document with that ID";
    ?>
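The link-collecting step above can also be written with a single XPath query that selects the `href` attributes directly. This is only a sketch, not a drop-in replacement: the inline `$html` string and its `/node/...` paths are made-up sample data standing in for the real search page, and it assumes the document ID is the last path segment of each link, as in the code above.

```php
<?php
// Hypothetical sample markup standing in for the search results page.
$html = '<html><body>'
      . '<div class="box"><a href="/node/26904">SZ-1000 A</a></div>'
      . '<div class="box"><a href="/node/26905">SZ-1000 B</a></div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select the href attribute nodes of every link inside a div.box.
$hrefs = $xpath->query('//div[@class="box"]//a/@href');

$ids = array();
foreach ($hrefs as $attr) {
    // Take the last path segment; end() needs a variable, not an expression.
    $parts = explode('/', $attr->nodeValue);
    $ids[] = end($parts);
}

// Test the collected array rather than the DOMNodeList object itself.
if (count($ids) === 0) {
    echo "There is no document with that ID\n";
} else {
    echo implode("\n", $ids), "\n";
}
```

With the sample markup, `$ids` ends up holding the two document IDs, which can then be passed to import.php one at a time.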

import.php

    <?php
    $value = $_GET['value'];
    $site = 'http://192.168.0.1:81/?q=node/'.$value;
    $html = file_get_contents($site);


    //libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);



    $elements = $doc->getElementsByTagName('tbody');
    $table = $elements->item(0);

    $rows = $table->childNodes;

        foreach ($rows as $node) {

          // nodeName exists on every node type; tagName is missing on text nodes
          if($node->nodeName == "tr"){

            $a = $node->firstChild->firstChild;

             foreach ($a->attributes as $attr) {
                if($attr->nodeName == "href"){
                    $value = $attr->nodeValue;
                    ?>
                        <!doctype html>
                        <head>
                            <title>Search</title>
                          <meta charset="UTF-8">
                        </head>
                        <body>
                            <table align="center">

                                <tr>
                                    <td></td>
                                    <td class="style-1">
                                    <br><h3>
                                    <?=$value?> | <a href='<?=$value?>'>LINK</a></h3><hr>
                                    </td>
                                </tr>
                            </table>
                        </body><?php

                }
            }
         }
    }?>
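As an alternative to walking `childNodes` by hand, the attachment links could probably be pulled out with one XPath expression. The sketch below assumes the attachments sit in the first cell of each row of a table that has an explicit `tbody`, as the code above implies. The sample markup and the `/files/...` paths are invented for illustration, and the expression is slightly looser than `firstChild->firstChild` because it accepts a link anywhere inside the first cell.

```php
<?php
// Hypothetical sample markup standing in for a document page with an attachment table.
$html = '<html><body><table><tbody>'
      . '<tr><td><a href="/files/sz-1000-a.pdf">sz-1000-a.pdf</a></td><td>10 KB</td></tr>'
      . '<tr><td><a href="/files/sz-1000-b.pdf">sz-1000-b.pdf</a></td><td>12 KB</td></tr>'
      . '</tbody></table></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// href attributes of links in the first cell of every table body row.
$attachments = array();
foreach ($xpath->query('//tbody/tr/td[1]//a/@href') as $attr) {
    $attachments[] = $attr->nodeValue;
}

// Escape each URL before writing it into the HTML output.
foreach ($attachments as $url) {
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    echo "<a href=\"{$safe}\">{$safe}</a><br>\n";
}
```

Iterating the node list this way also avoids the "could not be converted to string" error, since only the string values of individual attribute nodes are echoed, never the `DOMNodeList` itself.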

Need help with DOMXPath
