What is the best regular expression for this case?
Given this URL:
http://php.net/manual/en/function.preg-match.php
How can I select everything between http://php.net and .php, namely:
/manual/en/function.preg-match
This is for an Nginx configuration file.
Answer 0 (score: 20)
Regular expressions may not be the most efficient tool for this job.
Try parse_url() combined with pathinfo():
$url = 'http://php.net/manual/en/function.preg-match.php';
$path = parse_url($url, PHP_URL_PATH);
$pathinfo = pathinfo($path);
echo $pathinfo['dirname'], '/', $pathinfo['filename'];
The code above outputs:
/manual/en/function.preg-match
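The same idea can be wrapped in a small function. This is only a sketch: the name path_without_extension() is made up for illustration, and the extra handling of missing paths and root-level files is an assumption about what a caller might want, not something the answer specifies.

```php
<?php
// Hypothetical helper (not from the original answer): returns the URL path
// without its file extension, built on parse_url() and pathinfo().
function path_without_extension(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // malformed URL, or a URL with no path component
    }

    $info = pathinfo($path);
    // For a file directly under the root, 'dirname' is just the separator
    // ("/" or "\" on Windows); trim it so the result has a single slash.
    $dir = rtrim($info['dirname'], '/\\');

    return $dir . '/' . $info['filename'];
}

echo path_without_extension('http://php.net/manual/en/function.preg-match.php'), "\n";
```

Returning null for a URL without a path keeps the caller from silently concatenating empty pieces.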
Answer 1 (score: 8)
Like this:
if (preg_match('/(?<=net).*(?=\.php)/', $subject, $regs)) {
    $result = $regs[0];
}
Explanation:
"
(?<= # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
net # Match the characters “net” literally
)
. # Match any single character that is not a line break character
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
\. # Match the character “.” literally
php # Match the characters “php” literally
)
"
Answer 2 (score: 3)
Try this:
preg_match("/net(.*)\.php$/","http://php.net/manual/en/function.preg-match.php", $matches);
echo $matches[1];
// prints /manual/en/function.preg-match
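One thing to watch with patterns keyed on the literal "net": the match starts after the first "net" found anywhere in the string, not necessarily the end of the host name. The sketch below shows this with a made-up host (internet.example.net) and, as an assumption about what is wanted, an alternative that skips the scheme and host instead of looking for "net".

```php
<?php
$byLiteral = '/net(.*)\.php$/';
// Assumed alternative: skip "http://" or "https://" plus the host name.
$byHost = '~^https?://[^/]+(.*)\.php$~';

$sample = 'http://php.net/manual/en/function.preg-match.php';
$tricky = 'http://internet.example.net/a.php'; // made-up URL

preg_match($byLiteral, $sample, $a);
preg_match($byLiteral, $tricky, $b);
preg_match($byHost, $sample, $c);
preg_match($byHost, $tricky, $d);

echo $a[1], "\n"; // /manual/en/function.preg-match
echo $b[1], "\n"; // .example.net/a  (starts after the "net" in "internet")
echo $c[1], "\n"; // /manual/en/function.preg-match
echo $d[1], "\n"; // /a
```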
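Answer 3 (score: 3)
There is no need to use a regular expression to dissect a URL. PHP has the built-in functions pathinfo() and parse_url().
Answer 4 (score: 2)
Just for the fun of it, here are two approaches that have not been explored yet: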
substr($s, strpos($s, '/', 8), -4)
Or:
substr($s, strpos($s, '/', 8), -strlen($s) + strrpos($s, '.'))
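Both rely on the idea that the HTTP schemes http:// and https:// are at most 8 characters long, so it is usually enough to look for the first slash starting from the 9th position. If the extension is always .php, the first snippet will work; otherwise the second one is needed.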
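As a quick check of the second variant, here is a sketch wrapped in a function. The name path_via_substr() is hypothetical, and the length is written as the positive difference $dot - $start, which is equivalent to the negative length used above. It assumes the URL has both a path and an extension; otherwise strpos() or strrpos() would return false.

```php
<?php
// Hypothetical wrapper around the second substr() variant.
function path_via_substr(string $s): string
{
    $start = strpos($s, '/', 8); // first slash after "http://" or "https://"
    $dot   = strrpos($s, '.');   // last dot, assumed to start the extension

    return substr($s, $start, $dot - $start);
}

echo path_via_substr('http://php.net/manual/en/function.preg-match.php'), "\n";
echo path_via_substr('https://php.net/manual/en/function.preg-match.php'), "\n";
```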
For a pure regular expression solution, you can break the string down like this:
~^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)~
                              ^
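The path portion will be inside the first capturing group (i.e. index 1), indicated by the ^ on the line below the expression. The extension can then be removed using pathinfo():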
$parts = pathinfo($matches[1]);
echo $parts['dirname'] . '/' . $parts['filename'];
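Put together, the expression and pathinfo() could be used as in the following sketch. The query string and fragment on the sample URL are added here purely for illustration, to show that group 1 stops before them.

```php
<?php
$url = 'http://php.net/manual/en/function.preg-match.php?x=1#top';

// Group 1 captures the path: everything after the optional scheme and
// authority, up to the first "?" or "#".
if (preg_match('~^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)~', $url, $matches)) {
    $parts = pathinfo($matches[1]);
    echo $parts['dirname'] . '/' . $parts['filename'], "\n";
}
```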
You can also adjust the expression to:
([^?#]*?)(?:\.[^?#]*)?(?:\?|$)
This expression is not optimal, though, because it involves some backtracking. In the end, I would go for something less custom:
$parts = pathinfo(parse_url($url, PHP_URL_PATH));
echo $parts['dirname'] . '/' . $parts['filename'];
Answer 5 (score: 0)
This general URL pattern lets you pick out individual parts of a URL:
if (preg_match('/\\b(?P<protocol>https?|ftp):\/\/(?P<domain>[-A-Z0-9.]+)(?P<file>\/[-A-Z0-9+&@#\/%=~_|!:,.;]*)?(?P<parameters>\\?[-A-Z0-9+&@#\/%=~_|!:,.;]*)?/i', $subject, $regs)) {
    $result = $regs['file'];
    // or you can append $regs['parameters'] too
} else {
    $result = "";
}
Answer 6 (score: 0)
If you ask me, this is a better solution than the regular expressions offered so far: http://regex101.com/r/nQ8rH5
/http:\/\/[^\/]+\K.*(?=\.[^.]+$)/i
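A short sketch of how this pattern behaves in PHP. The second and third URLs are made up for illustration; they show that any extension is stripped, and that https URLs are not matched because the pattern spells out http:// literally.

```php
<?php
$pattern = '/http:\/\/[^\/]+\K.*(?=\.[^.]+$)/i';

// \K drops "http://host" from the reported match; the lookahead keeps the
// final ".extension" out of it.
$urls = [
    'http://php.net/manual/en/function.preg-match.php',
    'http://example.com/docs/page.html',   // made-up URL, different extension
    'https://php.net/manual/en/index.php', // https is not covered by the pattern
];

$results = [];
foreach ($urls as $url) {
    $results[$url] = preg_match($pattern, $url, $m) ? $m[0] : null;
    echo $url, ' => ', var_export($results[$url], true), "\n";
}
```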
Answer 7 (score: 0)
Simple:
$url = "http://php.net/manual/en/function.preg-match.php";
preg_match("/http:\/\/php\.net(.+)\.php/", $url, $matches);
echo $matches[1];
$matches[0] is the full match (here, the entire URL) and $matches[1] is the part you want.
Answer 8 (score: 0)
re> |(?<=\w)/.+(?=\.\w+$)|
Compile time 0.0011 milliseconds
Memory allocation (code space): 32
Study time 0.0002 milliseconds
Capturing subpattern count = 0
No options
First char = '/'
No need char
Max lookbehind = 1
Subject length lower bound = 2
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0007 milliseconds
 0: /manual/en/function.preg-match

re> |//[^/]*(.*)\.\w+$|
Compile time 0.0010 milliseconds
Memory allocation (code space): 28
Study time 0.0002 milliseconds
Capturing subpattern count = 1
No options
First char = '/'
Need char = '.'
Subject length lower bound = 4
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: //php.net/manual/en/function.preg-match.php
 1: /manual/en/function.preg-match

re> |/[^/]+(.*)\.|
Compile time 0.0008 milliseconds
Memory allocation (code space): 23
Study time 0.0002 milliseconds
Capturing subpattern count = 1
No options
First char = '/'
Need char = '.'
Subject length lower bound = 3
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: /php.net/manual/en/function.preg-match.
 1: /manual/en/function.preg-match

re> |/[^/]+\K.*(?=\.)|
Compile time 0.0009 milliseconds
Memory allocation (code space): 22
Study time 0.0002 milliseconds
Capturing subpattern count = 0
No options
First char = '/'
No need char
Subject length lower bound = 2
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: /manual/en/function.preg-match

re> |\w+\K/.*(?=\.)|
Compile time 0.0009 milliseconds
Memory allocation (code space): 22
Study time 0.0003 milliseconds
Capturing subpattern count = 0
No options
No first char
Need char = '/'
Subject length lower bound = 2
Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0011 milliseconds
 0: /manual/en/function.preg-match
Answer 9 (score: -1)
A regular expression to match everything after "net" and before ".php":
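// Delimiters added, and "/", "." and "-" added to the character class so the sample path can match.
$pattern = "/net([a-zA-Z0-9_\/.-]*)\.php/";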
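In the regular expression above, the group of characters enclosed in "()" is the captured match, which is the part you are looking for.
Hope it helps.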
Answer 10 (score: -1)
http:[\/]{2}.+?[.][^\/]+(.+)[.].+
Let's see what it does:
http:[\/]{2}.+?[.][^\/]+ - matches http://php.net
(.+) - captures everything up to the last dot: /manual/en/function.preg-match
[.].+ - matches the file extension, for example: .php
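To try the pattern in PHP it needs delimiters; the sketch below uses ~ as an arbitrary choice. The second URL is made up to illustrate a limitation: because the host part of the pattern requires a dot, a dotless host such as localhost does not match.

```php
<?php
$pattern = '~http:[\/]{2}.+?[.][^\/]+(.+)[.].+~';

$url = 'http://php.net/manual/en/function.preg-match.php';
if (preg_match($pattern, $url, $matches)) {
    echo $matches[1], "\n"; // /manual/en/function.preg-match
}

// Made-up URL with a dotless host: the pattern finds no match here.
$noMatch = preg_match($pattern, 'http://localhost/a/b.php');
var_dump($noMatch);
```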