Question

I'm trying to parse URLs with Ruby and return the ones that contain a given word after a "/" following .com, .org, and so on.

If I want to capture "questions" in a URL such as https://stackoverflow.com/questions, I also want to be able to capture https://stackoverflow.com/blah/questions. But I don't want to capture https://stackoverflow.com/queStioNs.

Currently my expression matches https://stackoverflow.com/questions, but it fails when "questions" comes after an additional "/", or two "/"s, and so on.

The regex ends with \bquestions\b.

I tried ([a-zA-Z]+\W{1}+\bjob\b|\bjob\b), but this only matches URLs with /questions and /blah/questions, not /blah/bleh/questions.

What am I doing wrong, and how can I match what I need?

Answer 1

You don't actually need a regex here; you can use the URI module instead:

require 'uri'

urls = ['https://stackoverflow.com/blah/questions', 'https://stackoverflow.com/queStioNs']

urls.each do |url|
  the_path = URI(url).path
  # include? is a case-sensitive substring check, so "/queStioNs" is skipped
  puts the_path if the_path.include?('questions')
end
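
Because include? is a substring check, it would also accept a path such as /myquestions. If the word has to be a whole path segment, one possible variation is to split the path on "/" and compare segments exactly. This is only a sketch: the helper name path_has_segment? is made up for illustration and is not part of the URI module.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): true when some path segment
# is exactly `word`. The comparison is case-sensitive.
def path_has_segment?(url, word)
  URI(url).path.split('/').include?(word)
end

urls = [
  'https://stackoverflow.com/questions',
  'https://stackoverflow.com/blah/bleh/questions',
  'https://stackoverflow.com/queStioNs',
  'https://stackoverflow.com/myquestions'
]

# Print only the URLs whose path contains the exact segment "questions"
urls.each { |url| puts url if path_has_segment?(url, 'questions') }
```

Splitting into segments works at any depth, so /blah/bleh/questions is handled the same way as /questions.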

Answer 2

I don't know whether there is a simpler way, but here is my solution:

url = "https://stackoverflow.com/blah/questions"
regexp = '^(https|http)?:\/\/[\w]+\.(com|org|edu)(\/{1}[a-z]+)*$'
# The last capture group holds the final path segment, e.g. "/questions"
match = url.match(regexp)
match[match.length - 1].gsub("/", "")

It returns 'questions'.

Update based on your comment:

Use [\S]*(\/questions){1}$
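
A minimal sketch of how a pattern like this might be applied, assuming "questions" has to be the last path segment and the match stays case-sensitive. It uses a simplified equivalent, /questions anchored with \z, because the leading [\S]* and the {1} do not change what matches. The constant and method names are made up for illustration.

```ruby
# Assumed simplification of the pattern above: "/questions" at the very
# end of the string. No /i flag, so the match is case-sensitive.
QUESTIONS_AT_END = %r{/questions\z}

# Hypothetical helper: true when the URL ends in "/questions"
def ends_with_questions?(url)
  url.match?(QUESTIONS_AT_END)
end

[
  'https://stackoverflow.com/questions',
  'https://stackoverflow.com/blah/bleh/questions',
  'https://stackoverflow.com/queStioNs'
].each do |url|
  puts url if ends_with_questions?(url)
end
```

Since the pattern only anchors on the final "/questions", the number of path segments before it does not matter.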

Hope that helps :)
