Question

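In PHP, I want to scrape some URLs using file_get_contents.

It works for most URLs, but it fails for some, for example walmart.com and buybuybaby.com.

The source code is simple, but is there a trick to fetching those URLs (walmart.com ...)?

I tried file_get_contents and also curl, but neither works.

Thanks in advance for any help.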

$url="http://www.buybuybaby.com/";
$homepage = file_get_contents($url);
echo $homepage;

Error: Warning: file_get_contents(https://www.buybuybaby.com/): failed to open stream: HTTP request failed! HTTP/1.0 400 Bad Request

Answer 1

You should use curl instead:

// Fetch $url with cURL, optionally sending POST data, a Referer header and cookies.
// $usecookie is either false or the path of a file used to store and read cookies.
function curl_get_content($url, $post = "", $refer = "", $usecookie = false)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);

    // Send a POST request only when a body was supplied.
    if ($post) {
        curl_setopt($curl, CURLOPT_POST, 1);
        curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
    }

    if ($refer) {
        curl_setopt($curl, CURLOPT_REFERER, $refer);
    }

    // Follow redirects and identify as a browser.
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3");
    // Note: these two lines disable TLS certificate checks, which is insecure.
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
    //curl_setopt($curl, CURLOPT_TIMEOUT_MS, 5000);

    if ($usecookie) {
        curl_setopt($curl, CURLOPT_COOKIEJAR, $usecookie);
        curl_setopt($curl, CURLOPT_COOKIEFILE, $usecookie);
    }

    // Return the response body as a string instead of printing it.
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

    $html = curl_exec($curl);
    if (curl_error($curl)) {
        echo 'cURL error: ' . (curl_error($curl));
    }
    curl_close($curl);
    return $html;
}

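This is because file_get_contents, by default, sends the request without extra header information or a User-Agent. With the options set above, cURL produces a request that looks like a browser's request, so sites such as Walmart, Amazon, Facebook, etc. do not reject it.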
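If you would rather keep file_get_contents, the same idea can be sketched with a stream context that supplies the missing headers. This is only a minimal sketch, not a guaranteed fix: the helper names browser_like_context and fetch_with_headers, the User-Agent string and the Accept headers are illustrative assumptions that do not come from the original post, and a site may still refuse the request for other reasons.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// builds a stream context so file_get_contents sends browser-like headers.
function browser_like_context(string $userAgent, int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'          => 'GET',
            // Sent as the User-Agent request header.
            'user_agent'      => $userAgent,
            // Extra request headers; the values are illustrative.
            'header'          => "Accept: text/html,application/xhtml+xml\r\n" .
                                 "Accept-Language: en-US,en;q=0.9\r\n",
            // Follow Location redirects, like CURLOPT_FOLLOWLOCATION above.
            'follow_location' => 1,
            // Read timeout in seconds.
            'timeout'         => $timeoutSeconds,
        ],
    ]);
}

// Hypothetical wrapper: returns the page body, or false if the request fails.
function fetch_with_headers(string $url)
{
    // Example User-Agent string; any current browser string could be used.
    $agent = 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0';
    return file_get_contents($url, false, browser_like_context($agent));
}

// Usage (performs a real HTTP request):
// $homepage = fetch_with_headers('https://www.buybuybaby.com/');
// if ($homepage !== false) { echo $homepage; }
```

The context is built in its own function so the headers can be inspected or changed without touching the fetch call. Unlike the cURL function above, this sketch leaves TLS certificate verification at PHP's defaults.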
