I want to extract the video ID from URLs that can take several different forms:
https://www.facebook.com/{page-name}/videos/{video-id}/
https://www.facebook.com/{username}/videos/{video-id}/
https://www.facebook.com/video.php?id={video-id}
https://www.facebook.com/video.php?v={video-id}
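How can I retrieve the video ID with a single Ruby regular expression?
I haven't managed to convert it to a Ruby regex yet, but I have (partially) managed to write it as a standard JS regex: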
^(https?://www\.facebook\.com/(?:video\.php\?v=\d+|.*?/videos/\d+))$
When I run the following code in Ruby, it raises an error:
text = "https://www.facebook.com/pili.morillo.56/videos/352355988613922/"
id = text.gsub( ^(https?://www\.facebook\.com/(?:video\.php\?v=\d+|.*?/videos/\d+))$ )
Answer 0: (score: 1)
Here is the regex I came up with: /(?<=\/videos\/)\d+?(?=\/|$)|(?<=[?&]id=)\d+?(?=&|$)|(?<=[?&]v=)\d+?(?=&|$)/
Breaking it down, we get this:
(?<=\/videos\/)\d+(?=\/|$)|
(?<=[?&]id=)\d+(?=&|$)|
(?<=[?&]v=)\d+(?=&|$)
Each of the three alternatives follows this simple structure: (?<=beforeMatch)target(?=afterMatch).
Here is the first one as an example:
(?<=\/videos\/) # Positive lookbehind
\d+ # Matching the digits
(?=\/|$) # Positive lookahead
So this means: match \d+ (any run of digits) as long as it comes after \/videos\/ and is followed by either \/ or the end of the line.
Hence, we can match any of 'id=', 'v=' or 'videos/'.
Full explanation:
(?<=\/videos\/) # Match as long as preceded by '/videos/'
\d+ # Matching the id digits
(?=\/|$) # As long as it's followed by '/' or the EOL
| # Or
(?<=[?&]id=) # Match as long as preceded by '?id=' or '&id='
\d+ # Matching the id digits
(?=&|$) # As long as it's followed by either '&' or the EOL
| # Or
(?<=[?&]v=) # Match as long as preceded by '?v=' or '&v='
\d+ # Matching the id digits
(?=&|$) # As long as it's followed by either '&' or the EOL
'EOL' means end of line.
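To show how the lookaround pattern could be used from Ruby, here is a minimal sketch. It assumes the pattern is written with %r{} (so slashes need no escaping) and uses String#[] to return the matched text directly. The constant name VIDEO_ID and the two video.php IDs are made up for illustration; only the first URL comes from the question.

```ruby
# Lookaround-based pattern from this answer, written with %r{} so that
# forward slashes do not need escaping. VIDEO_ID is a hypothetical name.
VIDEO_ID = %r{(?<=/videos/)\d+(?=/|$)|(?<=[?&]id=)\d+(?=&|$)|(?<=[?&]v=)\d+(?=&|$)}

# Sample URLs: the first is from the question, the other IDs are made up.
urls = [
  "https://www.facebook.com/pili.morillo.56/videos/352355988613922/",
  "https://www.facebook.com/video.php?id=12345",
  "https://www.facebook.com/video.php?v=67890&type=2"
]

# String#[] with a regex returns the matched substring (or nil if no match).
# Because the surrounding text is only asserted by lookarounds, the whole
# match is the ID itself and no capture group is needed.
ids = urls.map { |url| url[VIDEO_ID] }
p ids
```

Since the context is checked by lookbehind and lookahead rather than consumed, there is no group index to remember; a URL that matches none of the three alternatives simply yields nil.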
Answer 1: (score: 0)
RE = %r[https://www.facebook.com/(?:.+?/)?video(?:.*?[/=])(.+?)(?:/?\z)]
%w[
https://www.facebook.com/{page-name}/videos/{video-id}/
https://www.facebook.com/{username}/videos/{video-id}/
https://www.facebook.com/video.php?id={video-id}
https://www.facebook.com/video.php?v={video-id}
].map { |url| url[RE, 1] }
#⇒ ["{video-id}", "{video-id}", "{video-id}", "{video-id}"]
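One thing worth checking with this pattern is what happens when the URL carries extra query parameters. The sketch below is only an illustration: ORIGINAL_RE is the same pattern as above (delimited with braces), STRICT_RE is a hypothetical variant that escapes the dots, accepts http as well as https, and captures digits only, and the ID in the URL is made up.

```ruby
# Same pattern as in this answer, delimited with %r{} instead of %r[].
ORIGINAL_RE = %r{https://www.facebook.com/(?:.+?/)?video(?:.*?[/=])(.+?)(?:/?\z)}

# Hypothetical stricter variant (an assumption, not part of the answer):
# escaped dots, optional "s" in the scheme, and a digits-only capture so
# that trailing parameters are not swallowed into the ID.
STRICT_RE = %r{\Ahttps?://www\.facebook\.com/(?:.+?/)?video(?:.*?[/=])(\d+)}

url = "https://www.facebook.com/video.php?v=67890&type=2" # made-up ID

loose  = url[ORIGINAL_RE, 1] # captures everything up to the end of the string
strict = url[STRICT_RE, 1]   # captures only the leading run of digits

p loose
p strict
```

With the original pattern the capture runs to the end of the string, so it includes "&type=2"; restricting the group to \d+ is one possible way to avoid that if the IDs are always numeric.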
Answer 2: (score: 0)
You could use:
^https?:\/\/www\.facebook\.com\/.*?video(?:s|\.php.*?[?&](?:id|v)=)\/?([^\/&\n]+).*$
This matches from the beginning of the string and the start of the URL:
^https?:\/\/www\.facebook\.com\/
followed by:
.*?           # Match any character zero or more times
video         # Match video
(?:           # Non capturing group
  s           # Match s
  |           # Or
  \.php       # Match .php
  .*?         # Match any character zero or more times
  [?&]        # Match ? or &
  (?:id|v)=   # Match id or v in non capturing group followed by =
)             # Close non capturing group
\/?           # Match optional /
(             # Capturing group (group 1)
  [^\/&\n]+   # Match not / or & or newline
)             # Close capturing group
.*            # Match any character zero or more times
$             # End of the string
text = "https://www.facebook.com/pili.morillo.56/videos/352355988613922/"
id = text.gsub(/^https?:\/\/www\.facebook\.com\/.*?video(?:s|\.php.*?[?&](?:id|v)=)\/?([^\/&\n]+).*$/, "\\1")
puts id
This will result in: 352355988613922
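Note that gsub returns the input unchanged when the pattern does not match, so a non-Facebook URL would come back as-is rather than as an empty result. A minimal sketch of an alternative, assuming a small helper is acceptable: the names FB_VIDEO and facebook_video_id are hypothetical, and the pattern is the same one as above written with %r{}.

```ruby
# Same pattern as in this answer, delimited with %r{} so slashes need no
# escaping. FB_VIDEO is a hypothetical constant name.
FB_VIDEO = %r{^https?://www\.facebook\.com/.*?video(?:s|\.php.*?[?&](?:id|v)=)/?([^/&\n]+).*$}

# Hypothetical helper: returns capture group 1 (the video ID),
# or nil when the URL does not match the pattern at all.
def facebook_video_id(url)
  match = FB_VIDEO.match(url)
  match && match[1]
end

p facebook_video_id("https://www.facebook.com/pili.morillo.56/videos/352355988613922/")
p facebook_video_id("https://www.facebook.com/video.php?v=67890&type=2") # made-up ID
p facebook_video_id("https://example.com/")
```

Returning nil for a non-match lets the caller distinguish "no ID found" from a real ID, which the gsub version cannot do.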