Question

I am trying to list all the links and their names on a page, but I keep getting blank output from the code below.

$url="http://www.ciim.in/top-pr-dofollow-social-bookmarking-sites-list-2016";
$html = file_get_contents($url);

and the node part is:

$nodes = $my_xpath->query( '//table[@class="social_list"]/tbody/tr' );

    foreach( $nodes as $node )
    {

    $title  = $my_xpath->evaluate( 'td[1]/a"]', $node );
    $link  = $my_xpath->evaluate( 'td[1]/a/@href"]', $node );

    echo $title.",".$link."<br>";        

    }

Note that right-clicking is disabled on the site, so I used the Chrome developer tools to inspect the element's code.

Answer 1

Query the links directly:

$nodes = $xpath->query('//table[@class="social_list"]/tbody/tr/td/a');

Then get the title and URL inside the foreach:

$title = $node->textContent;
$href = $node->getAttribute('href');
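Putting the two snippets above together, a minimal self-contained sketch might look like the following. The inline HTML is a hypothetical stand-in for the real page (fetching it with file_get_contents is left out so the example runs offline), and the $rows array is only there for illustration.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = <<<HTML
<table class="social_list">
  <tbody>
    <tr><th>Site</th></tr>
    <tr><td><a href="http://example.com/a">Site A</a></td></tr>
    <tr><td><a href="http://example.com/b">Site B</a></td></tr>
  </tbody>
</table>
HTML;

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//table[@class="social_list"]/tbody/tr/td/a');

$rows = [];
foreach ($nodes as $node) {
    $title = trim($node->textContent);      // link text
    $href  = $node->getAttribute('href');   // link target
    $rows[] = [$title, $href];
    echo $title . "," . $href . "\n";
}
```

The sample markup contains an explicit tbody. Browser developer tools show a tbody even when the page source omits it, while DOMDocument does not add one, so if the real page has no tbody in its source, the /tbody step would probably need to be dropped from the query.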

Edit: I have tested this code, which retrieves the whole table:

//Query from parent div
$nodes = $xpath->query('//div[@class="table_in_overflow"]');

foreach ($nodes as $node) {
    $a = $node->getElementsByTagName("a");
    foreach($a as $item) {
      $href =  $item->getAttribute("href");
      $text = $item->nodeValue;
    }
}

Answer 2

Your selectors 'td[1]/a"]' and 'td[1]/a/@href"]' have a stray "] at the end, so change them to td[1]/a and td[1]/a/@href.
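As a rough sketch of the corrected selectors in use, assuming a hypothetical one-row table in place of the real page: note that evaluate() returns a DOMNodeList for a plain path expression, which cannot be echoed directly, so this sketch wraps each path in the XPath string() function. That wrapping is an addition for illustration, not part of the fix described above.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = '<table class="social_list"><tbody>'
      . '<tr><td><a href="http://example.com/a">Site A</a></td></tr>'
      . '</tbody></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();

$my_xpath = new DOMXPath($doc);
$nodes = $my_xpath->query('//table[@class="social_list"]/tbody/tr');

$lines = [];
foreach ($nodes as $node) {
    // string(...) makes evaluate() return a PHP string instead of a DOMNodeList.
    $title = $my_xpath->evaluate('string(td[1]/a)', $node);
    $link  = $my_xpath->evaluate('string(td[1]/a/@href)', $node);
    $lines[] = $title . "," . $link;
}
echo implode("\n", $lines), "\n";
```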

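In addition, you can improve your XPath by selecting only the tr elements that have a td containing a link, so that header rows without links are ignored:

'//table[@class="social_list"]/tbody/tr[td/a]'

This is more efficient than iterating over every tr.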
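To illustrate what the tr[td/a] predicate filters out, here is a small sketch over a hypothetical table with one header row, one row with a link, and one row without. The markup and variable names are assumptions made for the example.

```php
<?php
// Hypothetical sample markup: a header row, a linked row, and a row with no link.
$html = '<table class="social_list"><tbody>'
      . '<tr><th>Site</th></tr>'
      . '<tr><td><a href="http://example.com/a">Site A</a></td></tr>'
      . '<tr><td>No link here</td></tr>'
      . '</tbody></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Every row, including the header and the row without a link.
$allRows  = $xpath->query('//table[@class="social_list"]/tbody/tr');
// Only rows that have a td with an a child.
$linkRows = $xpath->query('//table[@class="social_list"]/tbody/tr[td/a]');

echo $allRows->length, " rows in total, ", $linkRows->length, " with a link\n";
```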
