I am working on this exercise.
Given these links:
1. http://example.com/cat1/subcat3/subcat4/tag/this%20is%20page/asdasda?start=130
2. http://example.com/cat1/subcat3/subcat4/tag/this%20is%20pageasdasd
3. example.it/news/tag/this%is%20n%page?adsadsadasd
4. http://example.com/tag/thispage/asdasdasd.-?asds=
5. http://example.com/tag/this%20is%20page/asdasd
6. /tag/this/asdasdasd
7. /tag/asd-asd/feed/this-feed
8. /tag/sd-asd
I need to get these results:
http://example.com/tag/this%20is%20page
http://example.com/tag/this%20is%20pageasdasd
example.it/tag/this%is%20n%page
http://example.com/tag/thispage
http://example.com/tag/this%20is%20page
/tag/this
/tag/asd-asd
But the regex must not touch the eighth one. The same goes for the domain.
I tried to get it working here: https://regex101.com/r/aB5mPn/5 but I cannot get it to skip the last case.
Can anyone help me?
Answer 0 (score: 2)
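If I am not mistaken, you could add a negative lookahead just before matching /tag... to assert that what follows is not, as in the eighth case, /tag/sd-asd running up to the end of the string ($).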
Your regex might look like this:
(?:(?:\/[A-Za-z0-9-]+)?)+(?!\/tag\/[^\/]+$)(\/tag\/[A-Za-z0-9-%]+)(.*)
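To check the pattern outside regex101, here is a minimal Python sketch that replaces each match with capture group 1 and compares the result with the expected output from the question. The pattern is copied from above unchanged. The names `shorten` and `CASES` are made up for illustration, and replacing with group 1 is my assumption about how the pattern is meant to be used.

```python
import re

# Pattern quoted from the answer, unchanged.
PATTERN = re.compile(
    r"(?:(?:\/[A-Za-z0-9-]+)?)+(?!\/tag\/[^\/]+$)(\/tag\/[A-Za-z0-9-%]+)(.*)"
)


def shorten(url: str) -> str:
    """Replace the first match with group 1; return the URL unchanged if nothing matches."""
    return PATTERN.sub(r"\1", url, count=1)


# (input, expected) pairs taken from the question.
# Input 8 has to stay exactly as it is.
CASES = [
    ("http://example.com/cat1/subcat3/subcat4/tag/this%20is%20page/asdasda?start=130",
     "http://example.com/tag/this%20is%20page"),
    ("http://example.com/cat1/subcat3/subcat4/tag/this%20is%20pageasdasd",
     "http://example.com/tag/this%20is%20pageasdasd"),
    ("example.it/news/tag/this%is%20n%page?adsadsadasd",
     "example.it/tag/this%is%20n%page"),
    ("http://example.com/tag/thispage/asdasdasd.-?asds=",
     "http://example.com/tag/thispage"),
    ("http://example.com/tag/this%20is%20page/asdasd",
     "http://example.com/tag/this%20is%20page"),
    ("/tag/this/asdasdasd", "/tag/this"),
    ("/tag/asd-asd/feed/this-feed", "/tag/asd-asd"),
    ("/tag/sd-asd", "/tag/sd-asd"),
]

for number, (url, expected) in enumerate(CASES, start=1):
    got = shorten(url)
    status = "ok" if got == expected else "differs"
    print(f"{number}. {status}: {got}")
```

In this sketch the lookahead keeps input 8 untouched, as asked. As far as I can tell, inputs 2 and 3 also come back unchanged: after /tag/ they contain no further slash before the end of the string, so the same lookahead rejects them. If those two matter, the lookahead probably needs to be narrowed.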