PHP: Filtering anchor tags with the Ganon DOM parser

Time: 2013-08-08 05:38:30

Tags: php dom html-parsing

I am using the Ganon PHP DOM parser library, and I need to parse a fairly complex HTML structure:

<table width="640" style="color: #333333;">
<tbody><tr>
<td valign="top">
<font face="Arial,Helevetica,sans-serif">
<a href="http://forums.timezone.com/index.php?t=tree&amp;goto=6577581&amp;rid=0">20mm Omega SMP Bond Bracelet Ref. 1503-825- PRICE DROP</a><br>
<font size="-1" color="#999999">Sales Corner - <a href="http://forums.timezone.com/index.php?t=usrinfo&amp;id=462&amp;rid=0">The Bigwatch Guy</a></font><font size="-1" color="#999999"> - Aug 7, 2013</font><br>
<font size="-1">20mm OMEGA SEAMASTER PROFESSIONAL "BOND" BRACELET REF. 1503-825. All s/s genuine Bond bracelet in excellent condition. The bracelet is 6.6 inches long...</font>
<br>
<br>
</font></td>
</tr>
<tr>
<td valign="top">
<font face="Arial,Helevetica,sans-serif">
<a href="http://forums.timezone.com/index.php?t=tree&amp;goto=6577577&amp;rid=0">Longines Lindbergh Hour Angle Chronograph- PRICE DROP</a><br>
<font size="-1" color="#999999">Sales Corner - <a href="http://forums.timezone.com/index.php?t=usrinfo&amp;id=462&amp;rid=0">The Bigwatch Guy</a></font><font size="-1" color="#999999"> - Aug 7, 2013</font><br>
<font size="-1">42mm (not counting the crown) LONGINES LINDBERGH HOUR ANGLE AUTOMATIC CHRONOGRAPH W/ COMPLETE BOXSET AND PAPERS - NEARMINT PLUS CONDITION. The strap h...</font>
<br>
<br>
</font></td>
</tr>
</table>

I am trying to get all anchor tags whose href attribute contains the string goto. I tried the following code:

<?php 
include("ganon.php");
$html = file_get_dom('http://forums.timezone.com/search/?q=Public+Forum&f=4&s=0');
$c = 1;
if (count($html("table[width='640']")) > 0) {
    foreach ($html("a[href=*goto]") as $elm) {
        echo $c.')'.$elm->href.'<br/>';
        $c++;
    }
}
?>

The code above produces no output other than this notice:

Notice: Expected identifier at 7! in D:\xampp\htdocs\govberg\ganon.php on line 2196

1 Answer:

Answer 0: (score: 1)

From the selector documentation, you can see:

  

E[foo*="bar"]: an E element whose "foo" attribute value contains the substring "bar"

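Your selector does not follow that syntax: the * must come before the = (href*=, not href=*), and the documentation shows the value in quotes. In a[href=*goto], the character at index 7 (counting from 0) is the misplaced *, which is likely what the notice refers to.

Change the following line: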

foreach($html("a[href=*goto]") as $elm){

to:

foreach($html('a[href*="goto"]') as $elm){

Output: Pastebin
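To cross-check the result without Ganon, here is a minimal sketch of the same "href contains goto" filter using PHP's built-in DOMDocument and DOMXPath. This is an alternative technique, not Ganon's API. The $markup sample is a reduced copy of the HTML from the question, and the variable names are only illustrative.

```php
<?php
// Reduced sample of the markup from the question (assumption: the live page has the same shape).
$markup = <<<'HTML'
<table width="640">
<tr><td>
<a href="http://forums.timezone.com/index.php?t=tree&amp;goto=6577581&amp;rid=0">20mm Omega SMP Bond Bracelet Ref. 1503-825- PRICE DROP</a>
<a href="http://forums.timezone.com/index.php?t=usrinfo&amp;id=462&amp;rid=0">The Bigwatch Guy</a>
</td></tr>
<tr><td>
<a href="http://forums.timezone.com/index.php?t=tree&amp;goto=6577577&amp;rid=0">Longines Lindbergh Hour Angle Chronograph- PRICE DROP</a>
<a href="http://forums.timezone.com/index.php?t=usrinfo&amp;id=462&amp;rid=0">The Bigwatch Guy</a>
</td></tr>
</table>
HTML;

// Parse the HTML; libxml warnings about imperfect markup are collected instead of printed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

// XPath contains() plays the role of the CSS *= (substring) attribute selector.
$xpath = new DOMXPath($doc);
$links = array();
foreach ($xpath->query('//table[@width="640"]//a[contains(@href, "goto")]') as $a) {
    $links[] = $a->getAttribute('href');
}

// Print the matches in the same numbered style as the question's code.
foreach ($links as $i => $href) {
    echo ($i + 1) . ') ' . $href . "\n";
}
```

Only the two thread links (the ones with goto= in the URL) should match; the usrinfo profile links are skipped. Note that getAttribute returns the decoded value, so &amp;amp; in the source appears as a plain & in the output.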

Hope this helps!