Unable to understand the syntax href=\"\/

Time: 2017-12-23 04:11:44

Tags: python-3.x youtube

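I am trying to write a simple Python script that uses youtube-dl to search for and download YouTube videos. I came across the following code for finding video IDs in the search results, and I cannot understand this line: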

search_results = re.findall(r'href=\"\/watch\?v=(.{11})', html_content.read().decode())

A YouTube video link looks like this: https://www.youtube.com/watch?v=MJGkm0UwNRk. Does href=\"\/ mean "skip the https://www.youtube.com part and move on to /watch?v=<11-character id>", or does it mean something else?

Code:

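import urllib.request
import urllib.parse
import re

# Build the URL-encoded query string from the search term typed by the user.
query_string = urllib.parse.urlencode({"search_query": input()})
# Fetch the search results page.
html_content = urllib.request.urlopen("http://www.youtube.com/results?" + query_string)
# Capture the 11 characters that follow href="/watch?v= in each link.
search_results = re.findall(r'href=\"\/watch\?v=(.{11})', html_content.read().decode())
# Print the URL of the first match.
print("http://www.youtube.com/watch?v=" + search_results[0])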

1 answer:

Answer 0: (score: 1)

You should read the documentation on regular expression operations.

Here is the explanation from regex101:

"href=\"\/watch\?v=(.{11})"g

href= matches the characters href= literally (case sensitive)
\" matches the character " literally (case sensitive)
\/ matches the character / literally (case sensitive)
watch matches the characters watch literally (case sensitive)
\? matches the character ? literally (case sensitive)
v= matches the characters v= literally (case sensitive)

1st Capturing Group (.{11})
    . matches any character (except for line terminators)
    {11} Quantifier: matches the preceding token exactly 11 times
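
So the pattern does not skip anything. It matches the literal text href="/watch?v= wherever it appears in the page source, which suggests the links in the HTML are relative and have no https://www.youtube.com part to begin with. The backslashes before " and / only escape those characters; in Python they are optional there, while the one before ? is required. A minimal sketch to try this out, assuming a made-up HTML fragment (sample_html is hypothetical, and real YouTube markup may look different):

```python
import re

# Hypothetical fragment of a results page, for illustration only.
sample_html = (
    '<a href="/watch?v=MJGkm0UwNRk" class="title">First</a>\n'
    '<a href="/results?search_query=x">Not a video link</a>\n'
    '<a href="/watch?v=abcdefghijk&amp;list=PL123">Second</a>\n'
)

# The pattern from the question.
original = r'href=\"\/watch\?v=(.{11})'
# Same pattern without the optional escapes before " and /.
simplified = r'href="/watch\?v=(.{11})'

# With one capturing group, findall returns only the captured text.
ids = re.findall(original, sample_html)
print(ids)  # ['MJGkm0UwNRk', 'abcdefghijk']

# Both spellings of the pattern find the same IDs.
print(re.findall(simplified, sample_html) == ids)  # True
```

Because the pattern has a capturing group, re.findall returns just the 11 captured characters of each match, which is why the script in the question can append search_results[0] directly to the watch URL.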