Isolating a numeric ID from a URL link

Date: 2013-01-27 00:01:26

Tags: php regex

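I was browsing Stack Overflow and found a great regex snippet here. There may be other ways to isolate a YouTube video ID, but I chose regex so that I could learn it. With input 1, the regex below ignores everything after the & character. That discards the video ID and leaves me with a wrong or empty ID. Why does the regex drop everything after the &?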

Wrong:

Input 1: http://www.youtube.com/watch?feature&v=317a815FLWQ

Result 1: http://www.youtube.com/watch?feature

Correct:

Input 2: http://www.youtube.com/watch?v=spDj54kf-vY&feature=g-vrec

Result 2: http://www.youtube.com/watch?v=spDj54kf-vY

The regex (with its original comments):

$text = preg_replace('~
        # Match non-linked youtube URL in the wild. (Rev:20111012)
        https?://         # Required scheme. Either http or https.
        (?:[0-9A-Z-]+\.)? # Optional subdomain.
        (?:               # Group host alternatives.
          youtu\.be/      # Either youtu.be,
        | youtube\.com    # or youtube.com followed by
          \S*             # Allow anything up to VIDEO_ID,
          [^\w\-\s]       # but char before ID is non-ID char.
        )                 # End host alternatives.
        ([\w\-]{11})      # $1: VIDEO_ID is exactly 11 chars.
        (?=[^\w\-]|$)     # Assert next char is non-ID or EOS.
        (?!               # Assert URL is not pre-linked.
          [?=&+%\w]*      # Allow URL (query) remainder.
          (?:             # Group pre-linked alternatives.
            [\'"][^<>]*>  # Either inside a start tag,
          | </a>          # or inside <a> element text contents.
          )               # End recognized pre-linked alts.
        )                 # End negative lookahead assertion.
        [?=&+%\w-]*        # Consume any URL (query) remainder.
        ~ix', 
        '<a href="http://www.youtube.com/watch?v=$1">YouTube link: $1</a>',
        $text);
    return $text;

1 answer:

Answer 0 (score: 6)

Forget the regex and use parse_url, which splits a URL into its components:

Array
(
    [scheme] => http
    [host] => hostname
    [user] => username
    [pass] => password
    [path] => /path
    [query] => arg=value
    [fragment] => anchor
)

Then use parse_str on the query part of the URL to extract the variables.

Edit

Here is a better demonstration:

$url = "http://www.youtube.com/watch?feature&v=317a815FLWQ";

$parsed_url = parse_url($url);
$query = $parsed_url['query'];

$parsed_query = array();
parse_str($query, $parsed_query);

var_dump($parsed_query);

Output:

array(2) {
  ["feature"]=>
  string(0) ""
  ["v"]=>
  string(11) "317a815FLWQ"
}

Edit 2

Another example, which extracts the ID from the second link given in the comments:

$url = "http://www.youtube.com/sandalsResorts#p/c/54B8C800269D7C1B/2/PPS-8DMrAn4";

$parsed_url = parse_url($url);
$fragment = $parsed_url['fragment'];
$fragment_parts = explode('/', $fragment);
$video_id = array_pop($fragment_parts);

print($video_id);

Output:

PPS-8DMrAn4

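If you are asking users for a link, you need to be very specific about what you expect. The link in the second example is not a video link, but if you want to be forgiving of user input, you can run the link through both snippets and check whether you get an ID.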
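Running a link through both snippets could look roughly like the sketch below. The function name extract_youtube_id is hypothetical, and the 11-character check is an assumption borrowed from the regex in the question rather than a documented guarantee about YouTube IDs. The nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper, not part of the original answer: look for the ID in
// the query string first, then fall back to the last fragment segment.
function extract_youtube_id(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return null; // parse_url gives up on seriously malformed URLs
    }

    $candidates = [];

    // Case 1: ordinary watch links carry the ID in the "v" query variable.
    if (isset($parts['query'])) {
        parse_str($parts['query'], $query);
        if (isset($query['v']) && is_string($query['v'])) {
            $candidates[] = $query['v'];
        }
    }

    // Case 2: channel-style links carry the ID as the last fragment segment.
    if (isset($parts['fragment'])) {
        $segments = explode('/', $parts['fragment']);
        $candidates[] = end($segments);
    }

    foreach ($candidates as $candidate) {
        // Assumption: an ID is exactly 11 characters from [A-Za-z0-9_-],
        // as in the regex quoted in the question.
        if (preg_match('/^[\w-]{11}$/D', $candidate) === 1) {
            return $candidate;
        }
    }

    return null; // neither location produced something that looks like an ID
}
```

Called with either input from the question, or with the channel-style link from Edit 2, this should return the 11-character ID. A null result means the link did not match either shape, which is the cue to ask the user for a proper video link.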