PHP Regex获取youtube视频ID?

时间:2010-08-03 01:25:11

标签: php regex

有人可以告诉我如何从网址中获取youtube id,无论网址中有哪些其他GET变量。

以此视频为例:http://www.youtube.com/watch?v=C4kxS1ksqtw&feature=related
因此,在v=和下一个&

之间

19 个答案:

答案 0 :(得分:287)

Use parse_url()parse_str()

(您可以使用正则表达式来处理任何事情,但它们很容易出错,因此如果有专门针对您要完成的内容的PHP函数,请使用它们。)

parse_url接受一个字符串并将其切割成一个包含大量信息的数组。您可以使用此数组,也可以将所需的项目指定为第二个参数。在这种情况下,我们对查询感兴趣,即PHP_URL_QUERY

现在我们有了v=C4kxS1ksqtw&feature=relate的查询,但我们只想要v=之后的部分。为此,我们转向parse_str,它基本上像字符串GET一样工作。它接受一个字符串并创建字符串中指定的变量。在这种情况下,会创建$v$feature。我们只对$v感兴趣。

为安全起见,您不希望仅存储命名空间中parse_url的所有变量(请参阅mellowsoon的评论)。而是将变量存储为数组的元素,以便您可以控制存储的变量,并且不会意外覆盖现有变量。

把所有东西放在一起,我们有:

<?php
$url = "http://www.youtube.com/watch?v=C4kxS1ksqtw&feature=relate";
parse_str( parse_url( $url, PHP_URL_QUERY ), $my_array_of_vars );
echo $my_array_of_vars['v'];    
  // Output: C4kxS1ksqtw
?> 

Working example


编辑:

嘿嘿 - 谢谢查尔斯。这让我发笑,我以前从未见过Zawinski的话:

Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems. - Jamie Zawinski

答案 1 :(得分:144)

preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $url, $matches);

这将占

youtube.com/v/{vidid}
youtube.com/vi/{vidid}
youtube.com/?v={vidid}
youtube.com/?vi={vidid}
youtube.com/watch?v={vidid}
youtube.com/watch?vi={vidid}
youtu.be/{vidid}

我略微改进了以支持: http://www.youtube.com/v/5xADESocujo?feature=autoshare&version=3&autohide=1&autoplay=1

我现在使用的是:

preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+(?=\?)|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $link, $matches);

答案 2 :(得分:92)

基于bokor对Anthony的回答的评论:

preg_match("/^(?:http(?:s)?:\/\/)?(?:www\.)?(?:m\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'>]+)/", $url, $matches);

$matches[1]包含vidid

匹配

不匹配:

  • www.facebook.com?wtv=youtube.com/v/vidid

答案 3 :(得分:32)

使用parse_strparse_url可以很容易地完成此操作,并且在我看来更可靠。

我的功能支持以下网址:

还包括功能下面的测试。

/**
 * Get Youtube video ID from URL
 *
 * @param string $url
 * @return mixed Youtube video ID or FALSE if not found
 */
function getYoutubeIdFromUrl($url) {
    $parts = parse_url($url);
    if(isset($parts['query'])){
        parse_str($parts['query'], $qs);
        if(isset($qs['v'])){
            return $qs['v'];
        }else if(isset($qs['vi'])){
            return $qs['vi'];
        }
    }
    if(isset($parts['path'])){
        $path = explode('/', trim($parts['path'], '/'));
        return $path[count($path)-1];
    }
    return false;
}
// Test
$urls = array(
    'http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player',
    'http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player',
    'http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player'
);
foreach($urls as $url){
    echo $url . ' : ' . getYoutubeIdFromUrl($url) . "\n";
}

答案 4 :(得分:25)

解决方案对于任何链接类型!!

<?php
function get_youtube_id_from_url($url)  {
     preg_match('/(http(s|):|)\/\/(www\.|)yout(.*?)\/(embed\/|watch.*?v=|)([a-z_A-Z0-9\-]{11})/i', $url, $results);    return $results[6];
}


echo get_youtube_id_from_url('http://www.youtube.com/watch?var1=blabla#v=GvJehZx3eQ1$var2=bla');
      // or                   http://youtu.be/GvJehZx3eQ1 
      // or                   http://www.youtube.com/embed/GvJehZx3eQ1
      // or                   http://www.youtu.be/GvJehZx3eQ1/blabla?xyz
?>

输出 GvJehZx3eQ1

答案 5 :(得分:12)

根据How to validate youtube video ids?

修正
<?php

$links = [
    "youtube.com/v/tFad5gHoBjY",
    "youtube.com/vi/tFad5gHoBjY",
    "youtube.com/?v=tFad5gHoBjY",
    "youtube.com/?vi=tFad5gHoBjY",
    "youtube.com/watch?v=tFad5gHoBjY",
    "youtube.com/watch?vi=tFad5gHoBjY",
    "youtu.be/tFad5gHoBjY",
    "http://youtu.be/qokEYBNWA_0?t=30m26s",
    "youtube.com/v/vidid",
    "youtube.com/vi/vidid",
    "youtube.com/?v=vidid",
    "youtube.com/?vi=vidid",
    "youtube.com/watch?v=vidid",
    "youtube.com/watch?vi=vidid",
    "youtu.be/vidid",
    "youtube.com/embed/vidid",
    "http://youtube.com/v/vidid",
    "http://www.youtube.com/v/vidid",
    "https://www.youtube.com/v/vidid",
    "youtube.com/watch?v=vidid&wtv=wtv",
    "http://www.youtube.com/watch?dev=inprogress&v=vidid&feature=related",
    "youtube.com/watch?v=7HCZvhRAk-M"
];

foreach($links as $link){
    preg_match("#([\/|\?|&]vi?[\/|=]|youtu\.be\/|embed\/)([a-zA-Z0-9_-]+)#", $link, $matches);
    var_dump(end($matches));
}

答案 6 :(得分:9)

我们知道视频ID长度为11个字符,可以在v=vi=v/vi/youtu.be/之前显示。所以最简单的方法是:

<?php
$youtube = 'http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player';

preg_match_all("#(?<=v=|v\/|vi=|vi\/|youtu.be\/)[a-zA-Z0-9_-]{11}#", $youtube, $matches);

var_dump($matches[0]);

输出:

array(8) {
  [0]=>
  string(11) "dQw4w9WgXcQ"
  [1]=>
  string(11) "dQw4w9WgXcQ"
  [2]=>
  string(11) "dQw4w9WgXcQ"
  [3]=>
  string(11) "dQw4w9WgXcQ"
  [4]=>
  string(11) "dQw4w9WgXcQ"
  [5]=>
  string(11) "dQw4w9WgXcQ"
  [6]=>
  string(11) "dQw4w9WgXcQ"
  [7]=>
  string(11) "dQw4w9WgXcQ"
}

答案 7 :(得分:5)

以下内容适用于所有youtube链接

<?php
    // Here is a sample of the URLs this regex matches: (there can be more content after the given URL that will be ignored)
    // http://youtu.be/dQw4w9WgXcQ
    // http://www.youtube.com/embed/dQw4w9WgXcQ
    // http://www.youtube.com/watch?v=dQw4w9WgXcQ
    // http://www.youtube.com/?v=dQw4w9WgXcQ
    // http://www.youtube.com/v/dQw4w9WgXcQ
    // http://www.youtube.com/e/dQw4w9WgXcQ
    // http://www.youtube.com/user/username#p/u/11/dQw4w9WgXcQ
    // http://www.youtube.com/sandalsResorts#p/c/54B8C800269D7C1B/0/dQw4w9WgXcQ
    // http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ
    // http://www.youtube.com/?feature=player_embedded&v=dQw4w9WgXcQ
    // It also works on the youtube-nocookie.com URL with the same above options.
    // It will also pull the ID from the URL in an embed code (both iframe and object tags)

$url = "https://www.youtube.com/watch?v=v2_MLFVdlQM";

    preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i', $url, $match);

    $youtube_id = $match[1];

echo $youtube_id;
    ?>

答案 8 :(得分:4)

我有一些帖子内容,我必须加密才能获得Youtube ID。它碰巧采用Youtube提供的<iframe>嵌入代码的形式。

 <iframe src="http://www.youtube.com/embed/Zpk8pMz_Kgw?rel=0" frameborder="0" width="620" height="360"></iframe>

我从@rob上面得到的以下模式。一旦找到匹配项,代码段会执行foreach循环,为了获得额外的奖励,我将其链接到Youtube上的预览图像。它可能会匹配更多类型的Youtube嵌入类型和网址:

$pattern = '#(?<=(?:v|i)=)[a-zA-Z0-9-]+(?=&)|(?<=(?:v|i)\/)[^&\n]+|(?<=embed\/)[^"&\n]+|(?<=‌​(?:v|i)=)[^&\n]+|(?<=youtu.be\/)[^&\n]+#';

preg_match_all($pattern, $post_content, $matches);

foreach ($matches as $match) {
    $img = "<img src='http://img.youtube.com/vi/".str_replace('?rel=0','', $match[0])."/0.jpg' />";
    break;
}

Rob的个人资料:https://stackoverflow.com/users/149615/rob

答案 9 :(得分:4)

(?<=\?v=)([a-zA-Z0-9_-]){11}

这也应该这样做。

答案 10 :(得分:4)

if (preg_match('![?&]{1}v=([^&]+)!', $url . '&', $m))
    $video_id = $m[1];

答案 11 :(得分:2)

我知道线程的标题是指正则表达式的使用,但正如Zawinski引用的那样,我真的认为避免正则表达式在这里是最好的。我推荐这个功能:

function get_youtube_id($url)
{
    if (strpos( $url,"v=") !== false)
    {
        return substr($url, strpos($url, "v=") + 2, 11);
    }
    elseif(strpos( $url,"embed/") !== false)
    {
        return substr($url, strpos($url, "embed/") + 6, 11);
    }

}

我建议这样做,因为YouTube视频的ID始终相同,与网址的风格无关,例如:

  • http://www.youtube.com/watch?v=t_uW44Bsezg
  • http://www.youtube.com/watch?feature=endscreen&v=Id3xG4xnOfA&NR=1
  • `和其他Ulr形式,其中“嵌入/”字放在Id之前...... !!

可能是嵌入式和iframe - ed的情况。

答案 12 :(得分:2)

$vid = preg_replace('/^.*(\?|\&)v\=/', '', $url);  // Strip all meuk before and including '?v=' or '&v='.

$vid = preg_replace('/[^\w\-\_].*$/', '', $vid);  // Strip trailing meuk.

答案 13 :(得分:1)

要提取捕获组中的id,也可以选择以下表达式或该表达式的某些派生形式:

(?im)\b(?:https?:\/\/)?(?:w{3}\.)?youtu(?:be)?\.(?:com|be)\/(?:(?:\??v=?i?=?\/?)|watch\?vi?=|watch\?.*?&v=|embed\/|)([A-Z0-9_-]{11})\S*(?=\s|$)

Demo

测试

$re = '/(?im)\b(?:https?:\/\/)?(?:w{3}\.)?youtu(?:be)?\.(?:com|be)\/(?:(?:\??v=?i?=?\/?)|watch\?vi?=|watch\?.*?&v=|embed\/|)([A-Z0-9_-]{11})\S*(?=\s|$)/';
$str = 'http://youtube.com/v/tFad5gHoBjY
https://youtube.com/vi/tFad5gHoBjY
http://www.youtube.com/?v=tFad5gHoBjY
http://www.youtube.com/?vi=tFad5gHoBjY
https://www.youtube.com/watch?v=tFad5gHoBjY
youtube.com/watch?vi=tFad5gHoBjY
youtu.be/tFad5gHoBjY
http://youtu.be/qokEYBNWA_0?t=30m26s
youtube.com/v/7HCZvhRAk-M
youtube.com/vi/7HCZvhRAk-M
youtube.com/?v=7HCZvhRAk-M
youtube.com/?vi=7HCZvhRAk-M
youtube.com/watch?v=7HCZvhRAk-M
youtube.com/watch?vi=7HCZvhRAk-M
youtu.be/7HCZvhRAk-M
youtube.com/embed/7HCZvhRAk-M
http://youtube.com/v/7HCZvhRAk-M
http://www.youtube.com/v/7HCZvhRAk-M
https://www.youtube.com/v/7HCZvhRAk-M
youtube.com/watch?v=7HCZvhRAk-M&wtv=wtv
http://www.youtube.com/watch?dev=inprogress&v=7HCZvhRAk-M&feature=related
youtube.com/watch?v=7HCZvhRAk-M
http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

var_dump($matches);

输出

array(30) {
  [0]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [1]=>
  array(2) {
    [0]=>
    string(34) "https://youtube.com/vi/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [2]=>
  array(2) {
    [0]=>
    string(37) "http://www.youtube.com/?v=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [3]=>
  array(2) {
    [0]=>
    string(38) "http://www.youtube.com/?vi=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [4]=>
  array(2) {
    [0]=>
    string(43) "https://www.youtube.com/watch?v=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [5]=>
  array(2) {
    [0]=>
    string(32) "youtube.com/watch?vi=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [6]=>
  array(2) {
    [0]=>
    string(20) "youtu.be/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [7]=>
  array(2) {
    [0]=>
    string(27) "http://youtu.be/qokEYBNWA_0"
    [1]=>
    string(11) "qokEYBNWA_0"
  }
  [8]=>
  array(2) {
    [0]=>
    string(25) "youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [9]=>
  array(2) {
    [0]=>
    string(26) "youtube.com/vi/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [10]=>
  array(2) {
    [0]=>
    string(26) "youtube.com/?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [11]=>
  array(2) {
    [0]=>
    string(27) "youtube.com/?vi=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [12]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [13]=>
  array(2) {
    [0]=>
    string(32) "youtube.com/watch?vi=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [14]=>
  array(2) {
    [0]=>
    string(20) "youtu.be/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [15]=>
  array(2) {
    [0]=>
    string(29) "youtube.com/embed/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [16]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [17]=>
  array(2) {
    [0]=>
    string(36) "http://www.youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [18]=>
  array(2) {
    [0]=>
    string(37) "https://www.youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [19]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [20]=>
  array(2) {
    [0]=>
    string(57) "http://www.youtube.com/watch?dev=inprogress&v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [21]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [22]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [23]=>
  array(2) {
    [0]=>
    string(33) "http://youtube.com/vi/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [24]=>
  array(2) {
    [0]=>
    string(33) "http://youtube.com/?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [25]=>
  array(2) {
    [0]=>
    string(42) "http://www.youtube.com/watch?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [26]=>
  array(2) {
    [0]=>
    string(34) "http://youtube.com/?vi=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [27]=>
  array(2) {
    [0]=>
    string(38) "http://youtube.com/watch?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [28]=>
  array(2) {
    [0]=>
    string(39) "http://youtube.com/watch?vi=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [29]=>
  array(2) {
    [0]=>
    string(27) "http://youtu.be/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
}

如果您希望简化/修改/探索表达式,请在regex101.com的右上角进行说明。如果愿意,您还可以在this link中查看它如何与某些示例输入匹配。


RegEx电路

jex.im可视化正则表达式:

enter image description here


答案 14 :(得分:1)

使用此代码:

$url = "http://www.youtube.com/watch?v=C4kxS1ksqtw&feature=related"; 
$parse = parse_url($url, PHP_URL_QUERY); 
parse_str($parse, $output); 
echo $output['watch'];

结果:C4kxS1ksqtw

答案 15 :(得分:0)

刚刚在http://snipplr.com/view/62238/get-youtube-video-id-very-robust/

在线发现了这个问题
%let infile=/.../file.csv;
%let outfile=/.../new_file.csv;

data _null_ ;
  infile "&infile" dsd dlm='|' lrecl=200000 truncover ; 
  file "&outfile" dsd dlm='|' lrecl=200000 ;
  length var1-var5 $32767 ;
  input var1-var5 ;
  put (var1-var3 var5) (+0) ;
run;

答案 16 :(得分:0)

我使用了Shawn's answer的数据,但对它进行了概括并缩短了正则表达式。此功能的主要区别在于,它不会测试有效的Youtube URL,而只会查找视频ID。表示它仍将返回www.facebook.com?wtv=youtube.com/v/vidid的视频ID。适用于所有测试用例,但有点松懈。因此,它将为https://www.twitter.com/watch?v=vidid之类的东西输出误报。如果数据超级不一致,请使用此方法,否则请使用更特定的正则表达式或parse_url()parse_str()

preg_match("/([\?&\/]vi?|embed|\.be)[\/=]([\w-]+)/",$url,$matches);
print($matches[2]);

答案 17 :(得分:0)

我认为您正在尝试这样做。

functions.php

答案 18 :(得分:0)

,如果我想从充满其他字符的字符串中提取youtue网址怎么办?像这样:

Lorem ipsum dolor坐下,奉献己任,sius do eiusmod tempor incididunt ut Labore et dolore magna aliqua。尽量不要抽烟,不要打扰锻炼ullamco https://www.youtube.com/watch?v=cPW9Y94BJI0 ,这是普遍的劳动成果。 Duis aute irure dolor in reprehenderit in volttable velit esse cillum dolore eu fugiat nulla pariatur。不会出现意外的圣人,反而会在犯规的情况下动手动手。

并从该字符串中获取https://www.youtube.com/watch?v=cPW9Y94BJI0