Question

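I know this is a common problem when using cURL, but after searching StackOverflow and Google I still haven't found a solution.

I have tried different user agents, but I get different errors:

The requested URL returned error: 400 Bad Requestresource(19) of type (Unknown)
The requested URL returned error: 400 Bad Requeststring(42) of type (Unknown) (I noticed that 42 refers to the '=' in $target_url)

Which one I get depends on the modifications I make to the code below, but nothing so far has pointed me toward a solution.

I'd appreciate any suggestions: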

$target_url = "http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)');
    curl_setopt($ch, CURLOPT_URL,$target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    //curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $html = curl_exec($ch);
    if ($html === false) $html = curl_error($ch);
    echo stripslashes($html);
    curl_close($ch);

    var_dump($ch);

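***I should note that I am actually reading the URL (and some others) from a file, so could the formatting of the URL be the problem? I have done this before without any trouble, but now I'm stumped. I read each line/URL and put it into an array that I loop over later.

***If I hard-code the URL it works fine, but for some reason reading it from the file produces the error.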

Answer 1

Do not use preg_replace() to filter the URL. Use:

    <?php
    $target_url = "http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $target_url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $html = curl_exec($ch);

    // Rewrite href values in the returned HTML that do not start with "http"
    // by prefixing them with $target_url.
    $html = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+) ([\"'>]+)#", '$1'.$target_url.'$2$3', $html);
    echo $html;
    curl_close($ch);
    var_dump($ch);
    ?>
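Since the hard-coded URL works and the same URL read from a file does not, one likely cause (an assumption, since the file-reading code is not shown in the question) is stray whitespace left on each line, such as a trailing newline or a "\r" from Windows line endings. A minimal sketch of cleaning the lines before handing them to cURL follows; `read_urls` is a hypothetical helper name, not something from the question:

```php
<?php
// Hypothetical helper: read one URL per line and return only clean, valid URLs.
function read_urls(string $path): array
{
    // FILE_IGNORE_NEW_LINES drops the trailing "\n" from each line;
    // FILE_SKIP_EMPTY_LINES drops lines that are empty after that.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // the file could not be read
    }

    $urls = [];
    foreach ($lines as $line) {
        // trim() also removes a leftover "\r" and any surrounding spaces or tabs.
        $url = trim($line);
        if ($url !== '' && filter_var($url, FILTER_VALIDATE_URL) !== false) {
            $urls[] = $url;
        }
    }
    return $urls;
}
```

Each element of the returned array can then be passed to `curl_setopt($ch, CURLOPT_URL, $url)` inside the loop. If the error persists, dumping a line with `var_dump(bin2hex($line))` may help reveal any invisible characters.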


PHP Curl - 400 Bad Request
