Can't get the h1 from a web page in PHP

Time: 2019-03-04 12:52:17

Tags: php

I want to get the company name from this page. Here is what I have tried:

<?PHP
$html = file_get_contents('https://www.goudengids.be/bedrijf/Willebroek/L11159413/CNC+Metal/');
$document = new DOMDocument;
$document ->loadHTML($html);
$xPath = new DOMXPath($document);
$anchorTags = $xPath->evaluate("//div[@class=\"title-logo\"]//h1");
foreach ((array)$anchorTags  as $anchorTag) {
    echo 'name : '.$anchorTag;
}
?>

I did something similar on another website and it worked, but here the $anchorTags array seems to be empty. Where is the problem? Thanks.

3 answers:

Answer 0 (score: 1)

The XPath you are looking for is:

[XPath expression missing from the source]

A simple //div[contains(@class,'title-logo')]//h1 will not do.

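Answer 1 (score: 0)

You don't need to cast the result of the XPath evaluate() method in order to use it in foreach(). You also need to read (I assume) the nodeValue to get the actual content of the heading tag...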

foreach ($anchorTags  as $anchorTag) {
    echo 'name : '.$anchorTag->nodeValue;
}

Which will output...

name : CNC Metal
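
To try this without fetching the live page, here is a minimal self-contained sketch. The inline $html string is a hypothetical stand-in for the downloaded markup (the real page may be structured differently), and companyNames() is a helper name made up for illustration. It iterates the DOMNodeList directly and reads nodeValue, as described above.

```php
<?php
// Hypothetical stand-in for the downloaded page; the real markup may differ.
$html = '<html><body><div class="title-logo"><h1>CNC Metal</h1></div></body></html>';

// Illustrative helper (not from the original post): returns the text of
// every h1 found inside a div whose class attribute is exactly "title-logo".
function companyNames(string $html): array
{
    $document = new DOMDocument();
    $document->loadHTML($html);
    $xPath = new DOMXPath($document);

    $names = [];
    // DOMNodeList is traversable, so it is iterated without an (array) cast.
    foreach ($xPath->query('//div[@class="title-logo"]//h1') as $heading) {
        // nodeValue holds the text content of the element.
        $names[] = trim($heading->nodeValue);
    }
    return $names;
}

foreach (companyNames($html) as $name) {
    echo 'name : ' . $name . PHP_EOL;
}
```

Collecting the strings into an array first keeps the extraction separate from the output, which makes it easier to check whether the XPath matched anything at all.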

Answer 2 (score: 0)

This works for me:

$html = file_get_contents('https://www.goudengids.be/bedrijf/Willebroek/L11159413/CNC+Metal/');
$document = new DOMDocument;
@$document->loadHTML($html); // using @ here to suppress a warning

$headings = $document->getElementsByTagName('h1');
foreach ($headings as $node) {
    echo 'name : '.$node->nodeValue;
}
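
As an alternative to the @ operator, libxml's own error handling can be switched to internal buffering so that parser warnings are collected instead of printed. The sketch below assumes a hypothetical inline $html string in place of the real page, and headingTexts() is an illustrative helper name, not part of the original answer.

```php
<?php
// Hypothetical markup; HTML5 elements such as <header> may trigger
// libxml warnings when parsed with loadHTML().
$html = '<html><body><header><h1>CNC Metal</h1></header></body></html>';

// Illustrative helper: returns the text of every h1 in the document
// while keeping libxml parse warnings out of the output.
function headingTexts(string $html): array
{
    // Buffer libxml errors internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $document->loadHTML($html);

    // Discard the buffered errors and restore the earlier behaviour.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($document->getElementsByTagName('h1') as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts;
}

foreach (headingTexts($html) as $text) {
    echo 'name : ' . $text . PHP_EOL;
}
```

Unlike @, this only silences libxml diagnostics, so unrelated PHP errors on the same line would still surface.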