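Regex to parse a YouTube URL containing comma-separated video IDs

Asked: 2016-08-22 05:21:16

Tags: c# regex

I am trying to extract YouTube video IDs that appear in a URL as a comma-separated list.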

URL: http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io

My expected output:

ClcKC_U_7fM, ujmoYyEyDP8, cwRFjWdxeRQ, Z4BKV121mP4, T241s7O7-Io

I tried:

Regex regexPattern = new Regex(@"""[^""\r\n]*""|'[^'\r\n]*'|[^,\r\n]*");
Match matchResults = regexPattern.Match(url);
while (matchResults.Success)
{
    Console.WriteLine(matchResults.Value);
    matchResults = matchResults.NextMatch();
}

Output:

http://www.youtube.com/watch?v=ClcKC_U_7fM

ujmoYyEyDP8

cwRFjWdxeRQ

Z4BKV121mP4

T241s7O7-Io

I also tried another approach:

var regex = new Regex(@"(?:.+?)?(?:\\/v\\/|watch\\/|\\?v=|\\&v=|youtu\\.be\\/|\\/v=|^youtu\\.be\\/)([a-zA-Z0-9_-]{11})+");
foreach (Match match in regex.Matches(url))
{
    //Console.WriteLine(match);
    foreach (var groupdata in match.Groups.Cast<Group>().Where(groupdata => !groupdata.ToString().StartsWith("http://") && !groupdata.ToString().StartsWith("https://") && !groupdata.ToString().StartsWith("youtu") && !groupdata.ToString().StartsWith("www.")))
    {
        groupdata.ToString();
        Console.WriteLine(groupdata.ToString());
    }
}

Output:

ClcKC_U_7fM

Any ideas on how to get the following result?

ClcKC_U_7fM, ujmoYyEyDP8, cwRFjWdxeRQ, Z4BKV121mP4, T241s7O7-Io
  

Update

I forgot to mention that the URL can come in several formats:

1) http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io

2) http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io

3) http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related

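Thanks

3 answers:

Answer 0 (score: 3)

Using the original code posted by the OP, replace the regular expression with (?<=(v=)|,)[^(,|&)]*. The lookbehind starts each match immediately after v= or a comma, and the character class ends the match at the next comma or &.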

string url = "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io";
Regex regexPattern = new Regex("(?<=(v=)|,)[^(,|&)]*");
Match matchResults = regexPattern.Match(url);
while (matchResults.Success)
{
    Console.WriteLine(matchResults.Value);
    matchResults = matchResults.NextMatch();
}

Output:

ClcKC_U_7fM
ujmoYyEyDP8
cwRFjWdxeRQ
Z4BKV121mP4
T241s7O7-Io
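
As a quick check against the three URL formats listed in the question's update, here is a minimal sketch that wraps the same pattern in a helper and joins the matches with ", " to match the requested output. The class name VideoIdRegexDemo and the method ExtractIds are hypothetical names used for illustration only, and the sketch assumes that commas appear in the URL only inside the v parameter.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class VideoIdRegexDemo
{
    // Hypothetical helper: applies the pattern from the answer above and
    // returns every non-empty match as an array of IDs.
    public static string[] ExtractIds(string url)
    {
        return Regex.Matches(url, "(?<=(v=)|,)[^(,|&)]*")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .Where(v => v.Length > 0) // guard against empty matches, e.g. ",,"
                    .ToArray();
    }

    public static void Main()
    {
        // The three formats from the question's update.
        string[] urls =
        {
            "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io",
            "http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io",
            "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related"
        };

        foreach (string url in urls)
        {
            // Joins the IDs into a single comma-and-space separated line.
            Console.WriteLine(string.Join(", ", ExtractIds(url)));
        }
    }
}
```

Note that the pattern does not validate the IDs themselves: any comma elsewhere in the URL, or any other parameter whose name ends in v, would also produce matches.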

Answer 1 (score: 1)

string url = "http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io";

// Everything after '?' is the query string; '&' separates its parameters.
// Note: s1[1] assumes the URL actually contains a '?'.
string[] s1 = url.Split('?');
string[] queries = s1[1].Split('&');

foreach (string query in queries)
{
    // Only the "v" parameter carries the video IDs.
    if (query.ToLower().StartsWith("v="))
    {
        string[] s2 = query.Split('=');
        string[] s3 = s2[1].Split(',');

        foreach (string s in s3)
        {
            Console.WriteLine(s);
        }

        break;
    }
}

Answer 2 (score: 1)

It may be easier to first split the URL on = and then use a simple string split on the commas.
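
One way this idea could look is sketched below. It is not the answerer's code: it splits on the literal v= rather than on a bare =, because formats 2 and 3 in the question contain other = signs, and it cuts the remainder at the next & before splitting on commas. The names SplitDemo and ExtractIds are hypothetical, and the sketch assumes that the first v= in the URL belongs to the video parameter.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: plain string splitting, no regular expressions.
    public static string[] ExtractIds(string url)
    {
        // Split once at the first "v="; parts[1] is everything after it.
        string[] parts = url.Split(new[] { "v=" }, 2, StringSplitOptions.None);
        if (parts.Length < 2)
        {
            return new string[0]; // no "v=" found in the URL
        }

        // Drop any trailing parameters such as "&feature=related".
        string idList = parts[1].Split('&')[0];

        // Split the remaining comma-separated list into individual IDs.
        return idList.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string url = "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related";
        Console.WriteLine(string.Join(", ", ExtractIds(url)));
    }
}
```

Because the split happens at the first v= in the string, a parameter such as rev= appearing earlier in the query would mislead this sketch; parsing the query string parameter by parameter, as in Answer 1, avoids that.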