Find a specific URL in <a> tags through regex

Date: 2019-03-04 20:39:50

Tags: html regex

I need a regex pattern that finds a specific URL in the HTML of web pages.

For example, I would like to search for this URL: domainurl.com

These are the URLs, together with their tags:

<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>
<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>
<a href="https://www.domainurl.com/page/pago">domain </a>

I am using this regex:

<a.*?[^>]* href="((https?:\/\/)?([\w\-])+\.{1}domainurl\.([a-z]{2,6})([\/\w\.-]*)*\/?)"
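To see why this pattern alone is not enough, here is a minimal sketch that runs it over the three sample tags. Python and the names PATTERN, SAMPLE_TAGS and matched_urls are assumptions made for illustration; the question does not say which regex engine is used.

```python
import re

# The pattern from the question, unchanged, written as a Python raw string.
PATTERN = re.compile(
    r'<a.*?[^>]* href="((https?:\/\/)?([\w\-])+\.{1}domainurl\.([a-z]{2,6})([\/\w\.-]*)*\/?)"'
)

# The three sample tags from the question.
SAMPLE_TAGS = [
    '<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>',
    '<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" '
    'target="_blank" rel="nofollow noopener">fgddf</a>',
    '<a href="https://www.domainurl.com/page/pago">domain </a>',
]


def matched_urls(tags):
    """Return the href captured in group 1 for every tag the pattern matches."""
    urls = []
    for tag in tags:
        match = PATTERN.search(tag)
        if match:
            urls.append(match.group(1))
    return urls


if __name__ == "__main__":
    for url in matched_urls(SAMPLE_TAGS):
        print(url)
```

All three tags match, because the pattern only looks at href and says nothing about target or rel.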

How can I match only this tag, the one that has target="_blank" rel="nofollow noopener"?

<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>

Is there a regex for target="_blank" and rel="nofollow noopener"?

This is what I have so far: https://regexr.com/49hne

1 answer:

Answer 0 (score: 1)

For the full URL, using a positive lookbehind:

(?<=\<a.*?href=\")(.*?\..*?\.[a-z]+)

DEMO

For domainurl.com, using a positive lookbehind:

(?<=\<a.*?www\.)([a-z]+\.[a-z]+)

DEMO2
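Both lookbehind patterns contain .*? inside the lookbehind, so they need an engine that supports variable-length lookbehind. Python's re module, for example, accepts only fixed-width lookbehind. A rough equivalent there is to match the prefix normally and read the wanted part from a capture group. The sketch below assumes Python; HOST_URL, DOMAIN and the two helper functions are hypothetical names, and the patterns are simplified approximations of the ones above, not the answerer's own.

```python
import re

# Scheme plus host of the href, e.g. "https://www.domainurl.com".
HOST_URL = re.compile(r'<a\b[^>]*?\bhref="(https?://[^/"]+)')

# The name after an optional "www.", e.g. "domainurl.com".
DOMAIN = re.compile(r'<a\b[^>]*?\bhref="https?://(?:www\.)?([a-z0-9-]+\.[a-z]+)')


def host_url(tag):
    """Return scheme and host from the first <a href="..."> in tag, or None."""
    match = HOST_URL.search(tag)
    return match.group(1) if match else None


def domain(tag):
    """Return the domain that follows an optional "www." prefix, or None."""
    match = DOMAIN.search(tag)
    return match.group(1) if match else None


if __name__ == "__main__":
    tag = '<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/">x</a>'
    print(host_url(tag))
    print(domain(tag))
```

Like the lookbehind version, DOMAIN only strips a leading www. and does not handle other subdomains.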

For target="_blank" and rel="nofollow noopener":

DEMO3

target.*?\".*\"
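One thing to watch with this pattern: the trailing .*\" is greedy, so it runs to the last double quote on the line rather than stopping at the end of the rel attribute. The sketch below, assuming Python, compares it with a tighter variant. STRICT_PATTERN, the sample line and the other.example link are made up for illustration, and the tighter variant assumes target comes directly before rel.

```python
import re

# Pattern from the answer, unchanged.
ANSWER_PATTERN = re.compile(r'target.*?\".*\"')

# Hypothetical tighter variant: spells out both attribute values and
# assumes that target is immediately followed by rel.
STRICT_PATTERN = re.compile(r'target="_blank"\s+rel="nofollow noopener"')

# Made-up line with two links, to show where each pattern stops.
LINE = (
    '<a href="https://www.domainurl.com/x" target="_blank" rel="nofollow noopener">a</a> '
    '<a href="https://other.example/" title="t">b</a>'
)

if __name__ == "__main__":
    print(ANSWER_PATTERN.search(LINE).group(0))  # runs on into the second tag
    print(STRICT_PATTERN.search(LINE).group(0))  # stops after the rel attribute
```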

For domainurl.com together with target="_blank" and rel="nofollow noopener":

DEMO4
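The pattern behind this demo is not reproduced on the page. One possible way to combine the conditions, sketched here under the assumption of Python, is to put one lookahead per required attribute right after <a, so the attributes may appear in any order, and then capture the href. TARGETED_LINK and find_targeted_links are hypothetical names, and this is not necessarily the pattern the demo used.

```python
import re

TARGETED_LINK = re.compile(
    r'<a\b'
    r'(?=[^>]*\btarget="_blank")'          # tag must contain target="_blank"
    r'(?=[^>]*\brel="nofollow noopener")'  # tag must contain this exact rel value
    r'[^>]*?\bhref="(https?://(?:[\w-]+\.)*domainurl\.[a-z]{2,6}[^"]*)"',
    re.IGNORECASE,
)


def find_targeted_links(html):
    """Return the href of every domainurl link that carries both attributes."""
    return TARGETED_LINK.findall(html)


if __name__ == "__main__":
    html = "\n".join([
        '<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>',
        '<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" '
        'target="_blank" rel="nofollow noopener">fgddf</a>',
        '<a href="https://www.domainurl.com/page/pago">domain </a>',
    ])
    print(find_targeted_links(html))
```

Because each lookahead is limited to [^>]*, it cannot look past the end of the current tag, so attributes on a later link do not count. For anything beyond simple, well-formed markup, an HTML parser is usually a safer choice than a regex.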