我无法通过卷曲达到所需的网址

时间:2020-06-22 17:40:31

标签: php curl php-curl

我尝试卷曲一个网站以便在我的网站上显示它,但我总是陷入白页。我认为是因为它可以重定向到登录表单,但是我不确定这是因为失败的原因。您无需登录即可访问我使用的网址。

代码在这里

$url = "http://www.faf.es/pnfg/NPcd/NFG_CmpJornada?cod_primaria=1000120&CodCompeticion=16867461&CodGrupo=17910021&CodTemporada=15&CodJornada=26&Sch_Codigo_Delegacion=1&Sch_Tipo_Juego=2";


$ch = curl_init();


curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_TIMEOUT, 99999);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

$output = curl_exec($ch);

curl_close($ch);

htmlentities($output);

html结果:

<html>
<head>
<title></title>
</head>
<body>
<!--
<h1>No se ha aceptado el cookie</h1>
-->
</body>
</html>

2 个答案:

答案 0 :(得分:0)

您应该再次访问htmlentities() documentation;它并没有按照您认为的那样做。

简而言之,htmlentities()返回一个string。您不会在任何地方捕获或输出此string结果。使用echo(或类似名称)确保将htmlentities($output)的结果输出到您期望的位置。

$url = "http://www.faf.es/pnfg/NPcd/NFG_CmpJornada?cod_primaria=1000120&CodCompeticion=16867461&CodGrupo=17910021&CodTemporada=15&CodJornada=26&Sch_Codigo_Delegacion=1&Sch_Tipo_Juego=2";


$ch = curl_init();


curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_TIMEOUT, 99999);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

$output = curl_exec($ch);

curl_close($ch);

echo htmlentities($output);

答案 1 :(得分:0)

您的请求网址仅验证具有JSESSIONID Cookie的请求。因此,在您必须获取有效的cookie之前:

<?php

$url = "http://www.faf.es/pnfg/NPcd/NFG_CmpJornada?cod_primaria=1000120&CodCompeticion=16867461&CodGrupo=17910021&CodTemporada=15&CodJornada=26&Sch_Codigo_Delegacion=1&Sch_Tipo_Juego=2";

$ch = curl_init();
$useragent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36';

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_TIMEOUT, 99999);


$output = curl_exec($ch);

curl_close($ch);
echo $output;

这为您提供了三个请求头:

HTTP/1.1 302 Movido temporalmente
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=648967C5FC907B1225EC61E9A65443E5; Path=/pnfg
Location: http://www.faf.es/pnfg/NLogin
Content-Length: 0 Date: Mon, 22 Jun 2020 19:33:30 GMT
Connection: close

HTTP/1.1 302 Movido temporalmente 
Server: Apache-Coyote/1.1
Cache-Control: no-cache
Pragma: no-cache 
Set-Cookie: JSESSIONID=9973308FA3AF81BE66C2F8C124671870; Path=/pnfg
Location: http://www.faf.es/pnfg/NLogin?NSess=1
Content-Length: 0 Date: Mon, 22 Jun 2020 19:33:30 GMT
Connection: close 

HTTP/1.1 200 OK Server: Apache-Coyote/1.1
Cache-Control: no-cache
Pragma: no-cache
Content-Type: text/html;charset=ISO-8859-15
Content-Length: 117
Date: Mon, 22 Jun 2020 19:33:30 GMT
Connection: close

我们有三个标头结果,因为我们说请求具有CURLOPT_FOLLOWLOCATION选项。因此,我们提出了三个要求。

我确定,第二个JSESSIONID cookie值有效。因此,为了获取网站信息,我们必须使用如下cookie:

<?php

$url = "http://www.faf.es/pnfg/NPcd/NFG_CmpJornada?cod_primaria=1000120&CodCompeticion=16867461&CodGrupo=17910021&CodTemporada=15&CodJornada=26&Sch_Codigo_Delegacion=1&Sch_Tipo_Juego=2";


$ch = curl_init();
$useragent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36';

curl_setopt($ch, CURLOPT_URL, $url);
// Secondary request cookie.
curl_setopt($ch, CURLOPT_COOKIE, "JSESSIONID=9973308FA3AF81BE66C2F8C124671870");
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_TIMEOUT, 99999);


$output = curl_exec($ch);

curl_close($ch);
echo $output;

网站内容出现后,请执行任何操作。