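How do I match the file extension in a URL that may end with a query string?

Time: 2014-09-16 00:04:54

Tags: javascript php regex

I want to match the file extension that comes before the question mark in all of the following URLs. So URL #4 should match the "pdf" in "file.pdf", but not the "exe" in "otherfile.exe".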

http://www.someplace.com/directory/file.pdf
http://www.someplace.com/directory/file.pdf?otherstuff=true
http://www.someplace.com/directory/file.pdf?other=true&more=false
http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe

How can I do this?

I tried this, but it doesn't work:

([^\.]+)(\?|[^\?]$)+

3 Answers:

Answer 0: (score: 6)

This is the version I would use:

/\w+\.[A-Za-z]{3,4}(?=\?|$)/

Here is a working version:

http://regex101.com/r/sY2fR0/1

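Use a lookahead for either a ? or the end of the string, (?=\?|$). The lookahead does not consume anything, so the match is just the file name and extension that come directly before it.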

$re = "/\\w+\\.[A-Za-z]{3,4}(?=\\?|$)/"; 
$str = "http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe\n\n"; 

preg_match($re, $str, $matches);
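
Since the question is also tagged javascript, the same lookahead idea can be sketched there. This is only a minimal sketch and not part of the original answer: the function name extensionBeforeQuery is hypothetical, and the pattern is narrowed to capture just the extension instead of the whole file name.

```javascript
// Hypothetical helper (not from the answer above): returns the extension
// that sits directly before the "?" or the end of the string, or null.
function extensionBeforeQuery(url) {
  // \.              a literal dot
  // ([A-Za-z]{3,4}) the extension, 3 to 4 letters as in the answer's pattern
  // (?=\?|$)        must be followed by "?" or by the end of the string
  const match = url.match(/\.([A-Za-z]{3,4})(?=\?|$)/);
  return match ? match[1] : null;
}

const urls = [
  'http://www.someplace.com/directory/file.pdf',
  'http://www.someplace.com/directory/file.pdf?otherstuff=true',
  'http://www.someplace.com/directory/file.pdf?other=true&more=false',
  'http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe',
];

// Each of the four URLs from the question yields "pdf".
for (const url of urls) {
  console.log(extensionBeforeQuery(url));
}
```

One caveat that applies to the pattern in general: it takes the first dot-plus-letters run that is followed by ? or the end of the string, so a URL with no extension in its path but one at the very end of its query string (or a bare host such as http://www.someplace.com) would still produce a match.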

Answer 1: (score: 0)

For matching, try this case-insensitive function:

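function matchURLs($desiredURL, $compareURL){
  $url = parse_url($compareURL);
  // Rebuild the URL without its query string, then escape it so that
  // characters such as "." and "/" are treated literally inside the pattern.
  $base = preg_quote($url['scheme'].'://'.$url['host'].$url['path'], '/');
  // preg_match returns 1 on a match, 0 on no match, false on error.
  return preg_match('/^'.$base.'$/i', $desiredURL) === 1;
}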
matchURLs('http://www.someplace.com/directory/file.pdf', 'http://www.someplace.com/directory/file.exe'); // false
matchURLs('http://www.someplace.com/directory/file.pdf', 'http://www.someplace.com/directory/file.pdf?value=file.exe'); // true

To get the URL before the ?:

function URL_before_query($url){
  $u = parse_url($url);
  return $u['scheme'].'://'.$u['host'].$u['path'];
}
echo URL_before_query('http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe'); // http://www.someplace.com/directory/file.pdf
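
The same parse-first approach can be sketched in JavaScript with the standard URL class, which separates the path from the query string much like parse_url does. This is an illustrative sketch, not part of the original answer: the helper name extensionFromPath is hypothetical, and treating a name with no dot (or only a leading dot) as "no extension" is an assumption.

```javascript
// Hypothetical helper: parse the URL first, then inspect only its path,
// so nothing in the query string can be mistaken for the extension.
function extensionFromPath(urlString) {
  // pathname excludes the query string and the fragment.
  const { pathname } = new URL(urlString);
  // Keep only the last path segment, e.g. "file.pdf".
  const fileName = pathname.slice(pathname.lastIndexOf('/') + 1);
  const dot = fileName.lastIndexOf('.');
  // Assumption: no dot, or a dot in first position, means no extension.
  return dot > 0 ? fileName.slice(dot + 1) : null;
}

console.log(extensionFromPath(
  'http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe'
));
```

Because the query string is discarded before any string matching happens, a value such as otherfile.exe inside it is never considered.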

Answer 2: (score: 0)

<?php
$str = '
http://www.someplace.com/directory/file1.pdf
http://www.someplace.com/directory/file2.pdf?otherstuff=true 
http://www.someplace.com/directory/file3.pdf?other=true&more=false 
http://www.someplace.com/directory/file4.pdf?other=true&more=false&value=otherfile.exe
';
$regex= '~.*/\K[^?\n]+~';

preg_match_all($regex, $str, $out, PREG_SET_ORDER);
print_r($out);
?>

Output:

Array ( 
   [0] => Array ( 
                 [0] => file1.pdf 
                ) 
   [1] => Array ( 
                 [0] => file2.pdf 
                ) 
   [2] => Array ( 
                 [0] => file3.pdf 
                ) 
   [3] => Array ( 
                 [0] => file4.pdf 
                 ) 
  )
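
JavaScript regular expressions have no \K, so a rough equivalent of this pattern needs a capture group instead. The following is a sketch under that substitution, not a translation supplied by the answer; the helper name fileNamesFromText is hypothetical.

```javascript
// Hypothetical helper: for each line of text, return the part after the
// last "/" and before any "?", using a capture group in place of \K.
function fileNamesFromText(text) {
  // .*\/       greedy, so it runs up to the last "/" on the line
  //            ("." does not match a newline without the s flag)
  // ([^?\n]+)  the file name, up to a "?" or the end of the line
  const re = /.*\/([^?\n]+)/g;
  return Array.from(text.matchAll(re), (m) => m[1]);
}

const str = `
http://www.someplace.com/directory/file1.pdf
http://www.someplace.com/directory/file2.pdf?otherstuff=true
http://www.someplace.com/directory/file3.pdf?other=true&more=false
http://www.someplace.com/directory/file4.pdf?other=true&more=false&value=otherfile.exe
`;

console.log(fileNamesFromText(str));
```

Like the PHP pattern, this returns the whole file name rather than only the extension, and it relies on the query string containing no "/" of its own.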