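I'm trying to write a redirect checker that tells me whether a URL is search-engine friendly. It has to check whether the URL is redirected and, if so, report whether the redirect is SEO-friendly (301 status code) or not (302/304).
Here is something similar that I found: http://www.webconfs.com/redirect-check.php
It should also be able to follow multiple redirects (e.g. from A to B to C) and tell me that A redirects to C.
This is what I have so far, but it doesn't work correctly (for example, when I enter www.example.com, it doesn't detect the redirect to www.example.com/page1):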
<?php
// You can edit the messages of the respective code over here
$httpcode = array();
$httpcode["200"] = "Ok";
$httpcode["201"] = "Created";
$httpcode["302"] = "Found";
$httpcode["301"] = "Moved Permanently";
$httpcode["304"] = "Not Modified";
$httpcode["400"] = "Bad Request";
if(count($_POST)>0)
{
    $url = $_POST["url"];
    $curlurl = "http://".$url."/";
    $ch = curl_init();
    // Set URL to download
    curl_setopt($ch, CURLOPT_URL, $curlurl);
    // User agent
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]);
    // Include header in result? (0 = no, 1 = yes)
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Should cURL return or print out the data? (true = return, false = print)
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Timeout in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    // Download the given URL, and return output
    $output = curl_exec($ch);
    $curlinfo = curl_getinfo($ch);
    if(($curlinfo["http_code"]=="301") || ($curlinfo["http_code"]=="302"))
    {
        $ch = curl_init();
        // Set URL to download
        curl_setopt($ch, CURLOPT_URL, $curlurl);
        // User agent
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]);
        // Include header in result? (0 = no, 1 = yes)
        curl_setopt($ch, CURLOPT_HEADER, 0);
        // Should cURL return or print out the data? (true = return, false = print)
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Timeout in seconds
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        // Download the given URL, and return output
        $output = curl_exec($ch);
        $curlinfo = curl_getinfo($ch);
        echo $url." is redirected to ".$curlinfo["url"];
    }
    else
    {
        echo $url." is not getting redirected";
    }
    // Close the cURL resource, and free system resources
    curl_close($ch);
}
?>
<form action="" method="post">
http://<input type="text" name="url" size="30" />/ <b>e.g. www.google.com</b><br/>
<input type="submit" value="Submit" />
</form>
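Answer 0 (score: 7)
If you want to record every redirect, you have to implement the tracing yourself and turn off cURL's automatic location following: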
function curl_trace_redirects($url, $timeout = 15) {
    $result = array();
    $ch = curl_init();
    $trace = true;
    $currentUrl = $url;
    // URLs already visited, used to stop on redirect loops
    $urlHist = array();
    while($trace && $timeout > 0 && !isset($urlHist[$currentUrl])) {
        $urlHist[$currentUrl] = true;
        curl_setopt($ch, CURLOPT_URL, $currentUrl);
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        $output = curl_exec($ch);
        if($output === false) {
            $traceItem = array(
                'errorno' => curl_errno($ch),
                'error' => curl_error($ch),
            );
            $trace = false;
        } else {
            $curlinfo = curl_getinfo($ch);
            // subtract the time spent on this hop from the overall budget
            if(isset($curlinfo['total_time'])) {
                $timeout -= $curlinfo['total_time'];
            }
            if(!isset($curlinfo['redirect_url'])) {
                $curlinfo['redirect_url'] = get_redirect_url($output);
            }
            if(!empty($curlinfo['redirect_url'])) {
                $currentUrl = $curlinfo['redirect_url'];
            } else {
                $trace = false;
            }
            $traceItem = $curlinfo;
        }
        $result[] = $traceItem;
    }
    if($timeout < 0) {
        $result[] = array('timeout' => $timeout);
    }
    curl_close($ch);
    return $result;
}
// apparently 'redirect_url' is not available on all curl-versions
// so we fetch the location header ourselves
function get_redirect_url($header) {
    // whitespace after the colon is optional, so allow zero or more blanks
    if(preg_match('/^Location:[ \t]*(.*)$/mi', $header, $m)) {
        return trim($m[1]);
    }
    return "";
}
You use it like this:
$res = curl_trace_redirects("http://www.example.com");
foreach($res as $item) {
    if(isset($item['timeout'])) {
        echo "Timeout reached!\n";
    } else if(isset($item['error'])) {
        echo "error: ", $item['error'], "\n";
    } else {
        echo $item['url'];
        if(!empty($item['redirect_url'])) {
            // redirection
            echo " -> (", $item['http_code'], ")";
        }
        echo "\n";
    }
}
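To also report whether each hop is SEO-friendly, as the question asks, the loop could be extended along these lines. This is only a sketch: the helper name `classify_hop` and the sample items are made up for illustration, and the rule (301 is friendly, any other redirect is not) is taken from the question rather than from any standard. Whether other codes such as 308 should count as friendly is left open.

```php
<?php
// Hypothetical helper (not part of the code above): label one trace item
// using the question's rule that only a 301 counts as SEO-friendly.
function classify_hop(array $item) {
    $code = isset($item['http_code']) ? (int)$item['http_code'] : 0;
    if (empty($item['redirect_url'])) {
        return 'no redirect';
    }
    if ($code === 301) {
        return 'SEO-friendly (301)';
    }
    return 'not SEO-friendly (' . $code . ')';
}

// Made-up items using the same keys the trace loop reads
// ('url', 'http_code', 'redirect_url'); not real server responses.
$sample = array(
    array(
        'url' => 'http://www.example.com/',
        'http_code' => 301,
        'redirect_url' => 'http://www.example.com/page1',
    ),
    array(
        'url' => 'http://www.example.com/page1',
        'http_code' => 200,
        'redirect_url' => '',
    ),
);

foreach ($sample as $item) {
    echo $item['url'], ': ', classify_hop($item), "\n";
}
```

In real use, the items would come from `curl_trace_redirects()` instead of the hand-written array, with error and timeout entries filtered out first.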
My code is certainly not fully thought out, but I guess it's a good start.
Edit:
Here is some sample output:
http://midas/~stefan/test/redirect/fritzli.html -> (302)
http://midas/~stefan/test/redirect/hansli.html -> (301)
http://midas/~stefan/test/redirect/heiri.html
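The question also asks for a summary of the form "A redirects to C". One way to get that from the trace is to look only at the first and last URL that were actually requested. The following is a sketch under assumptions: `summarize_trace` is a hypothetical name, and the sample array is hand-written to mirror the output above rather than captured from a live run.

```php
<?php
// Hypothetical helper: reduce a trace to "first URL is redirected to final URL".
// Items without a 'url' key (error or timeout entries) are skipped.
function summarize_trace(array $trace) {
    $urls = array();
    foreach ($trace as $item) {
        if (isset($item['url'])) {
            $urls[] = $item['url'];
        }
    }
    if (count($urls) === 0) {
        return 'no successful requests';
    }
    $first = $urls[0];
    $last = $urls[count($urls) - 1];
    if (count($urls) === 1) {
        return $first . ' is not redirected';
    }
    $hops = count($urls) - 1;
    return $first . ' is redirected to ' . $last . ' (' . $hops . ' hop(s))';
}

// Hand-written trace mirroring the sample output above.
$sampleTrace = array(
    array(
        'url' => 'http://midas/~stefan/test/redirect/fritzli.html',
        'http_code' => 302,
        'redirect_url' => 'http://midas/~stefan/test/redirect/hansli.html',
    ),
    array(
        'url' => 'http://midas/~stefan/test/redirect/hansli.html',
        'http_code' => 301,
        'redirect_url' => 'http://midas/~stefan/test/redirect/heiri.html',
    ),
    array(
        'url' => 'http://midas/~stefan/test/redirect/heiri.html',
        'http_code' => 200,
        'redirect_url' => '',
    ),
);

echo summarize_trace($sampleTrace), "\n";
```

If the trace stopped on an error or a timeout, the "final" URL here is only the last one that answered, so the error and timeout entries should be checked separately before trusting the summary.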