Question

This was one of my interview questions. I was rejected because I did not come up with a good enough solution.

The question was:

What is the one regex to match all URLs that contain job (case insensitive) in the relative
path (not the domain) in the following list:

    - http://www.glassdoor.com/job/ABC
    - https://glassdoor.com/job/
    - HTTPs://job.com/test
    - Www.glassdoor.com/foo/bar/joBs
    - http://192.168.1.1/ABC/job
    - http://bankers.jobs/ABC/job

My solution used lookahead and lookbehind: /(?<!\.)job(?!\.)/i. This works fine on the list above. However, it does not work if the URL is HTTPs://jobs.com/test.
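To make the failure concrete, here is a minimal Python sketch of that pattern. The names ASKER_PATTERN and asker_matches are made up for illustration, and Python's re module is assumed to behave like the original regex flavor for this pattern.

```python
import re

# The asker's pattern: "job" not preceded by a dot and not followed by a dot.
ASKER_PATTERN = re.compile(r"(?<!\.)job(?!\.)", re.IGNORECASE)


def asker_matches(url: str) -> bool:
    """Return True if the pattern finds "job" anywhere in the URL."""
    return ASKER_PATTERN.search(url) is not None


# The dot after "job" blocks the match, so job.com is rejected as intended.
print(asker_matches("HTTPs://job.com/test"))   # False
# Here "job" is followed by "s", not ".", so the domain slips through.
print(asker_matches("HTTPs://jobs.com/test"))  # True (a false positive)
```

The lookarounds only inspect the characters right next to "job", so they cannot tell whether the match sits in the domain or in the path.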

I would like to know what the correct answer to this question is. Thanks in advance for any suggestions!

Answer 1

Try this regex:

/\b(?:https?:\/\/)?[^\/:]+\/.*?job/gmi

Online demo: http://regex101.com/r/rV3oP8
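For anyone who wants to check this offline, here is a rough Python sketch that runs the pattern against the list from the question plus the jobs.com case. The flag mapping and the expected values are my own reading of the question, not something the answer states.

```python
import re

# Answer 1's pattern. The "g" flag is implied by search(); "m" and "i" map to
# re.MULTILINE and re.IGNORECASE. The escaped slashes are not needed in Python.
ANSWER1 = re.compile(r"\b(?:https?://)?[^/:]+/.*?job", re.IGNORECASE | re.MULTILINE)

# URL -> whether "job" appears in the path (assumed reading of the question).
EXPECTED = {
    "http://www.glassdoor.com/job/ABC": True,
    "https://glassdoor.com/job/": True,
    "HTTPs://job.com/test": False,
    "Www.glassdoor.com/foo/bar/joBs": True,
    "http://192.168.1.1/ABC/job": True,
    "http://bankers.jobs/ABC/job": True,
    "HTTPs://jobs.com/test": False,
}

results = {url: ANSWER1.search(url) is not None for url in EXPECTED}
for url, found in results.items():
    print(f"{url:36} matched={found} expected={EXPECTED[url]}")
```

The pattern requires a slash after the host part before it looks for "job", which is what keeps the two domain-only cases out.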

Answer 2

If you don't need to validate the URL, just focus on "job":

 #  /(?i)(?<=\/)job(?=\/|[^\S\r\n]*$)/

 (?i)
 (?<= / )
 job
 (?= / | [^\S\r\n]* $ )
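The expanded form above maps fairly directly onto Python's verbose mode. The sketch below is an approximation under that assumption: the inline (?i) is replaced by the re.IGNORECASE flag, and the name segment_matches is made up for illustration.

```python
import re

# Answer 2's pattern in verbose mode, mirroring the expanded form.
ANSWER2 = re.compile(
    r"""
    (?<= / )                 # "job" must come right after a slash
    job
    (?= / | [^\S\r\n]* $ )   # then a slash, or only spaces/tabs up to the end
    """,
    re.VERBOSE | re.IGNORECASE,
)


def segment_matches(url: str) -> bool:
    """Return True if "job" occurs as a whole slash-delimited segment."""
    return ANSWER2.search(url) is not None


print(segment_matches("http://www.glassdoor.com/job/ABC"))  # True
print(segment_matches("HTTPs://jobs.com/test"))             # False
# "joBs" is not exactly "job", so this URL from the list is not matched.
print(segment_matches("Www.glassdoor.com/foo/bar/joBs"))    # False
```

Note that this treats "job" as a complete path segment. If the joBs URL in the question is meant to match, this pattern is stricter than what was asked.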

Answer 3

Here is one I came up with:

^(?:.*://)?(?:[wW]{3}\.)?([^:/])*/.*job.*

It matches all of your examples, but does not match job.com or jobs.com. (job is matched only in the path.)

I tested it in Sublime Text, which is nice because regex matches are highlighted as you type.
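A quick way to reproduce that check outside an editor is the Python sketch below. One assumption to flag: the pattern as written has no case-insensitive flag, so re.IGNORECASE is added here, because "joBs" cannot match a lowercase "job" without it. The helper name job_in_path is made up for illustration.

```python
import re

# Answer 3's pattern. re.IGNORECASE is an assumption added for this sketch;
# the answer itself does not specify any flag.
ANSWER3 = re.compile(r"^(?:.*://)?(?:[wW]{3}\.)?([^:/])*/.*job.*", re.IGNORECASE)


def job_in_path(url: str) -> bool:
    """Return True if the pattern matches the URL from its start."""
    return ANSWER3.match(url) is not None


in_path = [
    "http://www.glassdoor.com/job/ABC",
    "https://glassdoor.com/job/",
    "Www.glassdoor.com/foo/bar/joBs",
    "http://192.168.1.1/ABC/job",
    "http://bankers.jobs/ABC/job",
]
in_domain_only = ["HTTPs://job.com/test", "HTTPs://jobs.com/test"]

print(all(job_in_path(u) for u in in_path))         # True
print(any(job_in_path(u) for u in in_domain_only))  # False
```

Because the host part is matched with [^:/], the first slash after the host has to appear before the pattern starts looking for "job".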

Answer 4

I was also asked this question in an interview, and this was my solution: /./+job/?./i. It works fine on Rubular.com.

Regex to match URLs that contain a string in the relative path, not in the domain
