Question

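I am trying to extract the title text from an HTML page and insert it into an object. I am using Symfony and PHP. The result of filterXPath does not seem to be plain text but the whole HTML page, and it throws an error. I don't know why.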

My code is:

$html =  $this->file_get_contents_curl("http://www.google.com/");
$urlData = [];
$crawler = new Crawler($html);
$urlData->title = $crawler->filterXPath('//title')->extract('_text');

If I do this instead, I see the title text:

return $crawler->filterXPath('//title')->extract('_text');

Answer 1

Try this:

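// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$html = file_get_contents("http://www.google.com/");
$dom1 = new DOMDocument;
$dom1->preserveWhiteSpace = false;
$dom1->loadHTML($html);
$xp = new DOMXPath($dom1);
// Only needed if the query calls php: functions; harmless otherwise.
$xp->registerNamespace("php", "http://php.net/xpath");
// query() returns a DOMNodeList, so iterate to read each node's text.
$urlData = $xp->query('//title');
foreach ($urlData as $title) {
    echo $title->textContent;
}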
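
Two details in the question's snippet are likely behind the error: `$urlData` is initialised as an array (`[]`) but then assigned to with `->title`, and `extract()` returns an array of values rather than a single string. The following is a minimal sketch of storing the title as plain text on an object, using PHP's built-in DOM extension. The inline sample markup and the use of `stdClass` for `$urlData` are assumptions made for illustration; they do not come from the original post.

```php
<?php
// Inline sample markup stands in for the fetched page (an assumption for this sketch).
$html = '<html><head><title>Example Title</title></head><body><p>Hi</p></body></html>';

// Collect parse warnings internally; real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// string(...) makes XPath return the text of the first match as a plain string
// instead of a node list.
$title = $xpath->evaluate('string(//title)');

// Use an object, not an array, so that ->title is a valid property assignment.
$urlData = new stdClass();
$urlData->title = trim($title);

echo $urlData->title, PHP_EOL;
```

If you prefer to stay with Symfony's Crawler, its `text()` method returns the text of the first matched node as a string, so `$crawler->filterXPath('//title')->text()` should serve the same purpose once `$urlData` is an object.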

Converting a filterXPath result to text
