Unable to traverse the DOM with simple_html_dom to find and iterate over table rows

Date: 2016-05-25 10:41:51

Tags: php parsing dom

I have a table with a thead and a tbody. The first few rows look like this:

<table class="general player-list-table">
    <thead>
        <tr>
            <th width="20"></th>
            <th class="header-name">
                <a href="playerlist.aspx?dpt=0&srt=1">Name</a>
            </th>
            <th width="20"></th>
            <th width="20"></th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=2">Club</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=3">Price</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=4">Rating</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=5" title="Games Played">PLD</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=6" title="Goals">GLS</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=7" title="Assists">ASS</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=8" title="Clean Sheets">CS</a>
            </th>
            <th>
                <a href="playerlist.aspx?dpt=0&srt=9" title="Goals Against">GA</a>
            </th>
            <th class="highlight">
                <a href="playerlist.aspx?dpt=0&srt=10" title="Month Points">MTH</a>
            </th>
            <th class="highlight header-tot">
                <a href="playerlist.aspx?dpt=0&srt=11" title="Overall total">TOT</a>
            </th>
        </tr>
    </thead>
    <tbody>
        <tr class="on">
            <td class="first no-border">
                <div class="pos pos1"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/petr-cech.aspx">P Cech</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.2</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>34</td>
            <td>0</td>
            <td>0</td>
            <td>16</td>
            <td>31</td>
            <td class="highlight">2</td>
            <td class="highlight">35</td>
        </tr>
        <tr>
            <td class="first no-border">
                <div class="pos pos1"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/david-ospina.aspx">D Ospina</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.1</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>4</td>
            <td>0</td>
            <td>0</td>
            <td>2</td>
            <td>5</td>
            <td class="highlight">0</td>
            <td class="highlight">3</td>
        </tr>
        <tr class="on">
            <td class="first no-border">
                <div class="pos pos2"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/hector-bellerin.aspx">H Bellerin</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.2</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>36</td>
            <td>1</td>
            <td>5</td>
            <td>18</td>
            <td>33</td>
            <td class="highlight">4</td>
            <td class="highlight">52</td>
        </tr>
        <tr>
            <td class="first no-border">
                <div class="pos pos2"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/calum-chambers.aspx">C Chambers</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.1</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>4</td>
            <td>0</td>
            <td>0</td>
            <td>2</td>
            <td>3</td>
            <td class="highlight">0</td>
            <td class="highlight">5</td>
        </tr>
        <tr class="on">
            <td class="first no-border">
                <div class="pos pos2"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/kieran-gibbs.aspx">K Gibbs</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.2</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>3</td>
            <td>1</td>
            <td>0</td>
            <td>1</td>
            <td>6</td>
            <td class="highlight">0</td>
            <td class="highlight">2</td>
        </tr>
        <tr>
            <td class="first no-border">
                <div class="pos pos2"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/nacho-monreal.aspx">N Monreal</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.3</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>36</td>
            <td>0</td>
            <td>3</td>
            <td>17</td>
            <td>34</td>
            <td class="highlight">4</td>
            <td class="highlight">42</td>
        </tr>
        <tr class="on">
            <td class="first no-border">
                <div class="pos pos3"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/gabriel-armando-de-abreu.aspx">Gabriel</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.2</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>19</td>
            <td>1</td>
            <td>0</td>
            <td>10</td>
            <td>18</td>
            <td class="highlight">2</td>
            <td class="highlight">24</td>
        </tr>
        <tr>
            <td class="first no-border">
                <div class="pos pos3"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/laurent-koscielny.aspx">L Koscielny</a>
            </td>
            <td class="highlight"></td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.5</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>32</td>
            <td>4</td>
            <td>0</td>
            <td>15</td>
            <td>31</td>
            <td class="highlight">2</td>
            <td class="highlight">43</td>
        </tr>
        <tr class="on">
            <td class="first no-border">
                <div class="pos pos3"></div>
            </td>
            <td class="highlight">
                <a href="/Classic/Stats/Player/per-mertesacker.aspx">P Mertesacker</a>
            </td>
            <td class="highlight">
                <div class="right playerstatus doubtful" title="Hamstring"></div>
            </td>
            <td>
                <div class="left club-tiny club-tiny-arsenal"></div>
            </td>
            <td>
                <a href="/Classic/Stats/Club/arsenal.aspx">ARS</a>
            </td>
            <td class="highlight">4.4</td>
            <td>
                <div class="chilli normal"></div>
            </td>
            <td>23</td>
            <td>0</td>
            <td>0</td>
            <td>9</td>
            <td>24</td>
            <td class="highlight">0</td>
            <td class="highlight">17</td>
        </tr>
        <!-- etc -->

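I am trying to iterate over each tbody > tr and output certain values from each cell, for example:

  • the first cell
  • the plaintext of the a element in the second cell
  • nothing from the third and fourth cells
  • the plaintext of the a element in the fifth cell
  • the plaintext of the sixth cell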

To do this, I am using the simple_html_dom library.

My code is as follows:

    foreach($dom->find("table.player-list-table tbody tr") as $row){
        $r["name"] = $row->find("td", 1)->find("a")->plaintext;
        $r["club"] = $row->find("td", 4)->find("a")->plaintext;
        $r["valu"] = $row->find("td", 5)->plaintext;

        print_r($r);
    }

However, this outputs:

PHP Fatal error: Uncaught Error: Call to a member function find() on null in /var/www/html/.../foo.php:28

which points to the first line inside the foreach:

$r["name"] = $row->find("td", 1)->find("a")->plaintext;

How can I do this?

1 Answer:

Answer 0: (score: 2)

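There are a couple of issues here. First, you need to add a class to the tbody: when you select rows through the tbody tag, simple_html_dom also returns the thead row. That header row contains only th cells, so find("td", 1) returns null and the chained find() call fails, which is why you get this error. Change the tbody tag to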

<tbody class="tbody">

and change the foreach loop to

foreach($dom->find("table.player-list-table .tbody tr") as $row){

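OK, now it returns only the rows inside .tbody.

The other problem is that calling find("a") without an index returns an array of every matching a element rather than a single element; if there are three a elements, the array holds three objects. You have to pick the first object and then read its plaintext. The code inside the foreach loop therefore becomes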

$name = $row->find("td", 1)->find("a");
$r["name"] = $name[0]->plaintext;
$club = $row->find("td", 4)->find("a");
$r["club"] = $club[0]->plaintext;
$r["value"] = $row->find("td", 5)->plaintext;

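I am using PHP 5.3.10, which does not support dereferencing a function's return value directly (for example find("a")[0], available from PHP 5.4), so I had to store the result in temporary variables.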
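As a rough sketch of the same idea, the lookup can be wrapped in a small helper that passes an index to find(), so no temporary array is needed and a missing cell or link yields null instead of a fatal error. The name first_link_text is hypothetical and not part of simple_html_dom; $row is assumed to be a simple_html_dom element.

```php
<?php
// Hypothetical helper, not part of simple_html_dom.
// Returns the trimmed plaintext of the first <a> inside cell number $index,
// or null when the row has no such cell or the cell contains no link.
function first_link_text($row, $index)
{
    // With an index, find() returns a single element, or null if nothing matches.
    $cell = $row->find('td', $index);
    if ($cell === null) {
        return null; // e.g. a header row that only has <th> cells
    }

    $link = $cell->find('a', 0);
    if ($link === null) {
        return null; // the cell exists but holds no <a>
    }

    return trim($link->plaintext);
}

// Assumed usage inside the loop from the answer:
//     $r["name"] = first_link_text($row, 1);
//     $r["club"] = first_link_text($row, 4);
```

Because the helper returns null rather than failing, the caller can skip a row when the name comes back as null.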

Also, if you do not want to add a class to the tbody, you will have to edit simple_html_dom.php. Here is how to do that: https://stackoverflow.com/a/4062007/3303041
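If changing either the markup or the library is not an option, another possibility is PHP's built-in DOM extension, where an XPath query can be limited to rows under tbody. This is only a sketch and it swaps simple_html_dom for DOMDocument and DOMXPath; the HTML is a shortened copy of the table from the question with a simplified header, and the variable names are made up for illustration.

```php
<?php
// Shortened, illustrative copy of the table from the question.
$html = <<<'HTML'
<table class="general player-list-table">
    <thead>
        <tr><th></th><th>Name</th></tr>
    </thead>
    <tbody>
        <tr class="on">
            <td class="first no-border"><div class="pos pos1"></div></td>
            <td class="highlight"><a href="/Classic/Stats/Player/petr-cech.aspx">P Cech</a></td>
            <td class="highlight"></td>
            <td><div class="left club-tiny club-tiny-arsenal"></div></td>
            <td><a href="/Classic/Stats/Club/arsenal.aspx">ARS</a></td>
            <td class="highlight">4.2</td>
        </tr>
    </tbody>
</table>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = array();

// Only rows that are children of <tbody> are selected, so the header row is never visited.
$query = '//table[contains(concat(" ", normalize-space(@class), " "), " player-list-table ")]/tbody/tr';
foreach ($xpath->query($query) as $tr) {
    $cells = $xpath->query('./td', $tr);
    if ($cells->length < 6) {
        continue; // not a full data row
    }

    // item(0) is the first matching <a>, or null when the cell has none.
    $nameLink = $xpath->query('.//a', $cells->item(1))->item(0);
    $clubLink = $xpath->query('.//a', $cells->item(4))->item(0);

    $rows[] = array(
        'name'  => $nameLink ? trim($nameLink->textContent) : null,
        'club'  => $clubLink ? trim($clubLink->textContent) : null,
        'value' => trim($cells->item(5)->textContent),
    );
}

print_r($rows);
```

The class test in the XPath expression is the usual way to match one class among several, since the table carries both general and player-list-table.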

Hope this helps you solve your problem.