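PHP: fetching a website's HTML with cURL doesn't work

Date: 2017-04-09 20:28:59

Tags: php curl

I have a problem. I'm trying to fetch the HTML of a website with cURL. It works fine when written like this: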

  $c = curl_init($url);
  curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
  //curl_setopt(... other options you want...)

  $html = curl_exec($c);

  if (curl_error($c))
      die(curl_error($c));

  // Get the status code
  $status = curl_getinfo($c, CURLINFO_HTTP_CODE);

  curl_close($c);
  $html = str_get_html($html);

But when I put the code into a function and call it later, it doesn't work! Here is the code using a function:

function get_html_from_page($url) {
  $c = curl_init($url);
  curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
  //curl_setopt(... other options you want...)

  $html = curl_exec($c);

  if (curl_error($c))
      die(curl_error($c));

  // Get the status code
  $status = curl_getinfo($c, CURLINFO_HTTP_CODE);

  curl_close($c);
  return $html;
}

//echo "Il contenuto del file è: ". $html . " con status: " . $status;
function get_last_vote_player() {

  echo '<table border="1"><thead><tr>';
  echo '<td><b>Giocatore</b></td>';
  echo '<td><b>Voto</b></td>';
  echo '<td><b>Gol fatti/subiti</b></td>';
  echo '<td><b>Assist</b></td>';
  echo '<td><b>Rigori Realizzati</b></td>';
  echo '</tr></thead><tbody>';

  $html = str_get_html(get_html_from_page('http://www.gazzetta.it/calcio/fantanews/voti/serie-a-2016-17/'));

  $dep = array();
  foreach($html->find('.magicTeamList') as $team) {
      $squadra = $team->first_child()->first_child()->last_child()->innertext;


      if (!in_array(ucwords($squadra),$dep)) {
          echo '<tr><td colspan="5"><b>'.ucwords($squadra).'</b></td></tr>';
          foreach($team->find('.playerName') as $player) {
              echo '<tr>';
              echo '<td>'.$player->find('a',0)->innertext . '</td>';
              //echo $player->innertext . '<br>';
              $voto = trim($player->next_sibling()->innertext);
              if ($voto == '-') $voto = 'S.V.'; else $voto = floatval($voto);
              echo '<td>' . $voto . '</td>';
              $golFatti = $player->next_sibling();
              echo '<td>'.$golFatti->next_sibling()->innertext . '</td>';
              $assist = $golFatti->next_sibling();
              echo '<td>'.$assist->next_sibling()->innertext . '</td>';
              $rigoriRealizzati = $assist->next_sibling();
              echo '<td>'.$rigoriRealizzati->next_sibling()->innertext . '</td>';
              echo '</tr>';
          }
      }

      $dep[] = ucwords($squadra);
  }
  echo '</tbody></table>';
}


get_last_vote_player();

It also doesn't work when I call it from inside a class like this:

class WebPage {
    public $url = '';

    public function __construct($url) {
        $this->url = $url;
    }

    protected function get_html_from_page() {
      $c = curl_init($this->url);
      curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
      //curl_setopt(... other options you want...)

      $html = curl_exec($c);

      if (curl_error($c))
          die(curl_error($c));

      // Get the status code
      $status = curl_getinfo($c, CURLINFO_HTTP_CODE);

      curl_close($c);
      return $html;
    } 
}

class PlayerVote extends WebPage {

    public function __construct($url) {
        parent::__construct($url);
    }

    public function get_last_vote_player() {

      echo '<table border="1"><thead><tr>';
      echo '<td><b>Giocatore</b></td>';
      echo '<td><b>Voto</b></td>';
      echo '<td><b>Gol fatti/subiti</b></td>';
      echo '<td><b>Assist</b></td>';
      echo '<td><b>Rigori Realizzati</b></td>';
      echo '</tr></thead><tbody>';

      $html = str_get_html($this->get_html_from_page('http://www.gazzetta.it/calcio/fantanews/voti/serie-a-2016-17/'));

      $dep = array();
      foreach($html->find('.magicTeamList') as $team) {
          $squadra = $team->first_child()->first_child()->last_child()->innertext;


          if (!in_array(ucwords($squadra),$dep)) {
              echo '<tr><td colspan="5"><b>'.ucwords($squadra).'</b></td></tr>';
              foreach($team->find('.playerName') as $player) {
                  echo '<tr>';
                  echo '<td>'.$player->find('a',0)->innertext . '</td>';
                  //echo $player->innertext . '<br>';
                  $voto = trim($player->next_sibling()->innertext);
                  if ($voto == '-') $voto = 'S.V.'; else $voto = floatval($voto);
                  echo '<td>' . $voto . '</td>';
                  $golFatti = $player->next_sibling();
                  echo '<td>'.$golFatti->next_sibling()->innertext . '</td>';
                  $assist = $golFatti->next_sibling();
                  echo '<td>'.$assist->next_sibling()->innertext . '</td>';
                  $rigoriRealizzati = $assist->next_sibling();
                  echo '<td>'.$rigoriRealizzati->next_sibling()->innertext . '</td>';
                  echo '</tr>';
              }
          }

          $dep[] = ucwords($squadra);
      }
      echo '</tbody></table>';
    }
}

$vote = new PlayerVote('http://www.gazzetta.it/calcio/fantanews/voti/serie-a-2016-17/');
$vote->get_last_vote_player();

This doesn't work: when I try to open the page, the response is "This site can't be reached".

Why?

1 answer:

Answer 0: (score: 1)

When you call it from inside the class, the error is in this line:

$html = str_get_html($this->get_html_from_page('http://www.gazzetta.it/calcio/fantanews/voti/serie-a-2016-17/'));

You don't need to pass the URL to get_html_from_page, because that method takes no parameters. It reads the URL from $this->url instead.
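To see what happens to the extra argument, here is a minimal sketch using a stand-in class. `DemoPage` and the example.com URLs are hypothetical and no network request is made. It illustrates that PHP silently ignores an argument passed to a user-defined method that declares no parameters, so the method only ever sees `$this->url`.

```php
<?php
// Hypothetical stand-in for WebPage: no network access,
// it just reports which URL it would fetch.
class DemoPage {
    public $url = '';

    public function __construct($url) {
        $this->url = $url;
    }

    // Declares no parameters, so the only URL it can use is $this->url.
    public function get_html_from_page() {
        return 'would fetch: ' . $this->url;
    }
}

$page = new DemoPage('http://example.com/a');

// The extra argument is not bound to any parameter; PHP ignores it.
$withArg = $page->get_html_from_page('http://example.com/b');
$withoutArg = $page->get_html_from_page();

var_dump($withArg === $withoutArg); // bool(true)
```

Both calls use the URL given to the constructor, so dropping the argument makes the call match the method's signature without changing which page is requested.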

Change it to: $html = str_get_html($this->get_html_from_page());

That should work.
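If the page still cannot be reached after that change, it may help to surface the cURL error and HTTP status instead of calling die(). The following is only a sketch under assumptions, not the asker's final code: the exception handling and the `load_dom()` method name are illustrative additions, and `str_get_html()` comes from the Simple HTML DOM library that the question already uses and that must be included separately.

```php
<?php
class WebPage {
    public $url = '';

    public function __construct($url) {
        $this->url = $url;
    }

    // Fetches $this->url and returns the response body.
    // Throws instead of die() so the caller can see what went wrong.
    protected function get_html_from_page() {
        $c = curl_init($this->url);
        curl_setopt($c, CURLOPT_RETURNTRANSFER, true);

        $html = curl_exec($c);
        $error = curl_error($c);
        $status = curl_getinfo($c, CURLINFO_HTTP_CODE);
        curl_close($c);

        if ($html === false) {
            // Transport-level failure (DNS, connection, timeout, ...).
            throw new RuntimeException('cURL error: ' . $error);
        }
        if ($status >= 400) {
            // The server answered, but with an error status.
            throw new RuntimeException('HTTP status ' . $status . ' for ' . $this->url);
        }
        return $html;
    }
}

class PlayerVote extends WebPage {
    // Hypothetical helper: parses the fetched page with Simple HTML DOM.
    // Note the call passes no argument; the URL comes from the constructor.
    public function load_dom() {
        return str_get_html($this->get_html_from_page());
    }
}
```

Calling code would create `new PlayerVote(...)` with the page URL and wrap `load_dom()` in a try/catch block, printing the exception message to learn whether the failure is at the connection level or an HTTP error status.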