Regex passes its unit tests, but doesn't seem to work when I actually try to use it

Asked: 2016-05-18 13:05:33

Tags: php regex curl

This is a link to the String in a linter.

This is the expression itself:

(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))

I'm trying to use this expression to validate almost any URL.

We can see here that it passes the unit tests as expected:

[screenshot: unit test results in the linter]

However, as I said, when I try to run my code the validation seems to be ignored... which leaves me scratching my head.

These are the relevant parts of the code:

//kindly taken from here: http://stackoverflow.com/a/34589895/2226328
function checkPageSpeed($url){    
    $result = ''; // initialise so the check below never reads an undefined variable
    if (function_exists('file_get_contents')) {    
        $result = @file_get_contents($url);
    }   

    if ($result == '') {    
        $ch = curl_init();    
        $timeout = 60;    
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER,1);//get the header
        curl_setopt($ch, CURLOPT_NOBODY,1);//and *only* get the header    
        curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);//get the response as a string from curl_exec(), rather than echoing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);  
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);  
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_FRESH_CONNECT,1);//don't use a cached version of the url    

        $result = curl_exec($ch);    
        curl_close($ch);    
     }    
    return $result;    
}  

function pingGoogle($url){

    echo "<h1>".$url."</h1>";

    if(strtolower(substr($url, 0, 4)) !== "http") {
        echo "adding http:// to $url <br/>";
        $url = "http://".$url;
        echo "URL is now $url <br/>";
    } 

    //original idea from https://gist.github.com/dperini/729294
    $re = "/(?i)\\b((?:https?:\\/\\/|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}\\/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\\\".,<>?«»“”‘’]))/"; 

    $test = preg_match($re, $url);  
    var_export($test);

    if( $test === 1) { 
        echo "$url passes pattern Test...let's check if it's actually valid ..."; 
    }
    else 
    { 
        echo  "URL formatted proper but isn't an active URL! <br/>"; 
    }
}

// test calls belong outside the function; inside the if-branch they recurse forever
pingGoogle("hjm.google.cm/");
pingGoogle("gamefaqs.com");

2 Answers:

Answer 0 (score: 0)

Holy moly, that's a regex and a half...

Consider using parse_url and letting PHP do the processing for you. Since you're only interested in the domain name, try:

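$host = parse_url($url, PHP_URL_HOST);
// parse_url() returns null when the component is missing and false on a seriously malformed URL
if( $host === null || $host === false) {
    echo "Failed to parse, no host found";
}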
else {
    // do something with supposed host here
}
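
A slightly fuller sketch of the same idea, wrapped in a helper. The name `extractHost()` is hypothetical and does not come from the question. It assumes that input without a scheme should be treated as `http://`, the way `pingGoogle()` does, because `parse_url()` otherwise reads `example.com/x` as a path and reports no host. Keep in mind that `parse_url()` only splits a URL into parts; it does not validate them.

```php
<?php
// Hypothetical helper (not from the question): returns the host part of
// a URL-like string, or null when parse_url() cannot find one.
function extractHost(string $url): ?string
{
    // Assumption: scheme-less input means http://, as in pingGoogle().
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);

    // parse_url() yields false for seriously malformed URLs and null
    // when the requested component is absent; treat both as "no host".
    return (is_string($host) && $host !== '') ? $host : null;
}

var_dump(extractHost('gamefaqs.com'));
var_dump(extractHost('http:///nohost'));
```

Returning `null` for every failure case keeps the caller's check down to a single comparison.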

Answer 1 (score: 0)

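Have you considered just using PHP's built-in validation filter FILTER_VALIDATE_URL together with filter_var()? In terms of both simpler code and performance, it is probably a better choice than rolling your own regex-based solution.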

http://php.net/manual/en/function.filter-var.php

http://php.net/manual/en/filter.filters.validate.php
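
As a rough sketch of what that could look like: the helper name `isValidUrl()` is made up for illustration, and prepending `http://` to scheme-less input is an assumption carried over from the question's `pingGoogle()`. `FILTER_VALIDATE_URL` requires a scheme, which is why that step is there.

```php
<?php
// Hypothetical helper (not from the question): true when $url passes
// PHP's built-in URL filter.
function isValidUrl(string $url): bool
{
    // Assumption: scheme-less input means http://, as in pingGoogle().
    if (stripos($url, 'http') !== 0) {
        $url = 'http://' . $url;
    }

    // filter_var() returns the filtered string on success, false on failure.
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(isValidUrl('gamefaqs.com'));
var_dump(isValidUrl('http://'));
```

This only checks syntax. A well-formed but nonexistent address such as `hjm.google.cm/` still passes, so reachability needs a separate request. The same probably explains the behaviour in the question: once `http://` has been prepended, the `https?://` branch of the pattern can match almost any remaining input, so the regex check rarely fails.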