I use the following regular expression (an updated version of the linkify regex) to match links without matching email addresses.
(\s*|[^a-zA-Z0-9.\+_\/"\>\-]|^)(?:([a-zA-Z0-9\+_\-]+(?:\.[a-zA-Z0-9\+_\-]+)*@)?(http:\/\/|https:\/\/|ftp:\/\/|scp:\/\/){1}?((?:(?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+]*\.)+))(?:[a-zA-Z]{2,})((?::\d{1,5}))?((?:[\/|\?](?:[\-a-zA-Z0-9_%#*&+=~!?,;:.\/]*)*)[\-\/a-zA-Z0-9_%#*&+=~]|\/?)?)([^a-zA-Z0-9\+_\/"\<\-]|$)
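But this regex fails to match URLs such as https://someurl:3333/view/something
Can you help me solve this? Thanks!
Answer 0 (score: 1)
This should be the "minimally modified" version of the expression that also matches domains without a top-level domain (TLD):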
(\s*|[^a-zA-Z0-9.\+_\/"\>\-]|^)(?:([a-zA-Z0-9\+_\-]+(?:\.[a-zA-Z0-9\+_\-]+)*@)?(http:\/\/|https:\/\/|ftp:\/\/|scp:\/\/){1}?((?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+.]*)(?:\.[a-zA-Z]{2,})?)((?::\d{1,5}))?((?:[\/|\?](?:[\-a-zA-Z0-9_%#*&+=~!?,;:.\/]*)*)[\-\/a-zA-Z0-9_%#*&+=~]|\/?)?)([^a-zA-Z0-9\+_\/"\<\-]|$)
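A quick way to check the claim is to run both expressions against the URL from the question. The sketch below uses Python's `re` module purely for illustration (the question does not say which regex engine is in use); the helper name `find_url` and the email sample are assumptions made for this example.

```python
import re

# Both patterns are copied verbatim from the question and the answer above.
ORIGINAL = re.compile(
    r'(\s*|[^a-zA-Z0-9.\+_\/"\>\-]|^)(?:([a-zA-Z0-9\+_\-]+(?:\.[a-zA-Z0-9\+_\-]+)*@)?(http:\/\/|https:\/\/|ftp:\/\/|scp:\/\/){1}?((?:(?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+]*\.)+))(?:[a-zA-Z]{2,})((?::\d{1,5}))?((?:[\/|\?](?:[\-a-zA-Z0-9_%#*&+=~!?,;:.\/]*)*)[\-\/a-zA-Z0-9_%#*&+=~]|\/?)?)([^a-zA-Z0-9\+_\/"\<\-]|$)'
)
MODIFIED = re.compile(
    r'(\s*|[^a-zA-Z0-9.\+_\/"\>\-]|^)(?:([a-zA-Z0-9\+_\-]+(?:\.[a-zA-Z0-9\+_\-]+)*@)?(http:\/\/|https:\/\/|ftp:\/\/|scp:\/\/){1}?((?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+.]*)(?:\.[a-zA-Z]{2,})?)((?::\d{1,5}))?((?:[\/|\?](?:[\-a-zA-Z0-9_%#*&+=~!?,;:.\/]*)*)[\-\/a-zA-Z0-9_%#*&+=~]|\/?)?)([^a-zA-Z0-9\+_\/"\<\-]|$)'
)


def find_url(pattern, text):
    """Hypothetical helper: return (scheme, host, port, path) or None."""
    m = pattern.search(text)
    if m is None:
        return None
    # Counting capturing parentheses from 1: 3 = scheme, 4 = host,
    # 5 = port, 6 = path. In ORIGINAL, group 4 holds only the subdomain
    # labels because the TLD is matched outside the group.
    return m.group(3), m.group(4), m.group(5), m.group(6)


for sample in ("https://someurl:3333/view/something", "user@example.com"):
    print(sample)
    print("  original:", find_url(ORIGINAL, sample))
    print("  modified:", find_url(MODIFIED, sample))
```

Both patterns still require an explicit scheme, which is why a bare email address is rejected by either one.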
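The part that changed is the capture group that grabs the domain (group 4 when counting capturing parentheses from 1). It went from: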
(
(?:
(?:
[a-zA-Z0-9]
[a-zA-Z0-9_%\-_+]*
\.
)+ (?# this is how they repeated for optional subdomains)
)
)
(?:
[a-zA-Z]{2,} (?# here is the mandatory TLD)
)
to this:
(
(?:
[a-zA-Z0-9]
[a-zA-Z0-9_%\-_+.]* (?# the . is in the character class here for subdomains)
)
(?:
\.
[a-zA-Z]{2,}
)? (?# this TLD is optional)
)
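To see the effect of this change in isolation, the two domain fragments can be compiled on their own and tried against a few host names. This is a minimal sketch in Python, assuming the fragments behave the same way in the target engine; `captured_host` and the sample hosts are made up for illustration.

```python
import re

# Domain fragments taken from the two listings above, inline comments removed.
OLD_DOMAIN = re.compile(r'((?:(?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+]*\.)+))(?:[a-zA-Z]{2,})')
NEW_DOMAIN = re.compile(r'((?:[a-zA-Z0-9][a-zA-Z0-9_%\-_+.]*)(?:\.[a-zA-Z]{2,})?)')


def captured_host(pattern, host):
    """Hypothetical helper: group 1 if the whole host matches, else None."""
    m = pattern.fullmatch(host)
    return m.group(1) if m else None


for host in ("someurl", "www.example.com", "a..b"):
    print(host, "| old:", captured_host(OLD_DOMAIN, host),
          "| new:", captured_host(NEW_DOMAIN, host))
```

One trade-off worth knowing: because `.` now sits inside the character class, the new fragment also accepts hosts with consecutive dots such as `a..b`, which the old fragment rejected. If that matters, the subdomain repetition would need to be restructured rather than folded into the class.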