Question

I'm trying to scrape some HTML with a DOM scraping library called PHP Simple HTML Dom Parser.

I have the following method:

public function getFourLevels() {
    // Iterate through the four pollen levels [Wunderground only has four day
    // pollen prediction]
    for($i = 0; $i < 4; $i++) {

        // Get the raw level
        $rawLevels = $this->html
            ->find("td.text-center.even-four", $i)
            ->plaintext;

        // Clean the raw level
        $level = substr(
            $rawLevels,
            PollenBuddy::LEVELS
        );

        // Push each date to the dates array
        array_push($this->levels, $level);
    }

    return $this->levels;
}

With the method above I'm trying to scrape the following HTML:

    <td class="text-center even-four">
    <strong>Sunday</strong>
    <div>February 15, 2015</div>
    </td>
    <td class="text-center even-four">
    <strong>Monday</strong>
    <div>February 16, 2015</div>
    </td>
    <td class="text-center even-four">
    <strong>Tuesday</strong>
    <div>February 17, 2015</div>
    </td>
    <td class="text-center even-four">
    <strong>Wednesday</strong>
    <div>February 18, 2015</div>
    </td>

Here is the source document.

Running var_dump on the return value of the function above gives:

array(4) {
  [0]=>
  bool(false)
  [1]=>
  bool(false)
  [2]=>
  bool(false)
  [3]=>
  bool(false)
}

I'm not sure what the problem is. Any suggestions would be appreciated. Thanks!

Answer 1

Using:

        // Get the raw level
        $rawLevels = $this->html
            ->find("td.even-four", $i)
            ->plaintext;

returns the correct data.
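
The bool(false) entries in the question are consistent with find() matching nothing for the compound selector: there is then no element to read plaintext from, and substr() on an empty value with a positive offset returns false on PHP versions before 8.0 (an empty string from 8.0 onward). Below is a minimal sketch of a guard for that case. cleanLevel() is a hypothetical helper, not part of PollenBuddy or the parser, and it assumes PHP 7.1+ and that PollenBuddy::LEVELS is a non-negative character offset.

```php
<?php
/**
 * Hypothetical helper: strips the first $offset characters from a scraped
 * string. Returns null when there is no string, or when the string is
 * shorter than the offset, instead of letting substr() fail silently.
 */
function cleanLevel(?string $raw, int $offset): ?string
{
    if ($raw === null || strlen($raw) < $offset) {
        return null;
    }
    return substr($raw, $offset);
}

// Simulates the case where the selector matched nothing.
$missing = cleanLevel(null, 7);

// Simulates a matched cell; 'Level: ' is the 7-character prefix to strip.
$found = cleanLevel('Level: High', 7);
```

In the loop from the question, the same idea would mean checking the result of find() for null before reading plaintext, and skipping or logging the missing cell rather than pushing false into the array.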

Found via this link: simple html dom - space in class name
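
To confirm that the cells really carry both classes, independently of how the scraping library parses selectors, the markup can be checked with PHP's built-in DOM extension. This is a swapped-in technique (DOMDocument and DOMXPath rather than PHP Simple HTML Dom Parser), shown only as a cross-check. cellTextsByClasses() is a hypothetical helper, the sample wraps two of the question's cells in a table row, and class names are assumed to contain no quote characters.

```php
<?php
// Two cells from the question, wrapped in a table so the structure is valid.
$html = <<<'HTML'
<table><tr>
<td class="text-center even-four">
<strong>Sunday</strong>
<div>February 15, 2015</div>
</td>
<td class="text-center even-four">
<strong>Monday</strong>
<div>February 16, 2015</div>
</td>
</tr></table>
HTML;

/**
 * Hypothetical helper: returns the whitespace-normalized text of every <td>
 * whose class attribute contains all of the given class names.
 */
function cellTextsByClasses(string $html, array $classes): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Build one whole-word class test per requested class name.
    $conditions = [];
    foreach ($classes as $class) {
        $conditions[] = sprintf(
            "contains(concat(' ', normalize-space(@class), ' '), ' %s ')",
            $class
        );
    }
    $query = '//td[' . implode(' and ', $conditions) . ']';

    $texts = [];
    foreach ((new DOMXPath($doc))->query($query) as $cell) {
        // Collapse line breaks and runs of spaces into single spaces.
        $texts[] = trim(preg_replace('/\s+/', ' ', $cell->textContent));
    }
    return $texts;
}

$texts = cellTextsByClasses($html, ['text-center', 'even-four']);
// $texts is ['Sunday February 15, 2015', 'Monday February 16, 2015']
```

If this check finds the cells while the two-class selector in the scraping library does not, the markup is fine and the difference lies in selector handling, which matches the fix above.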
