Question

I'm looking for a good regular expression in Java to extract all link URLs and all email addresses from a string. So far I have this regex for links:

    String linkRegex = "http[s]*://(\\w+\\.)*(\\w+)";
    Pattern pattern = Pattern.compile(linkRegex);

    Matcher matcher = pattern.matcher(stringAddres);

    while (matcher.find()) {
        String currentLink = matcher.group();
    }

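This returns links such as http://twitter.com, but it also returns https://google. Is there a way to exclude links like https://google?

I also need a regex that extracts the email address from a string. For example, from this: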

href="mailto:contact@example.com">contact@example.com</a></span>

I should get only contact@example.com

Answer 1

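There are many answers to this question, and simple regex patterns work for most common addresses. I would still suggest this regex, which is based on the RFC 5322 standard: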

    (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Copied from this site.
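To illustrate how a pattern of this kind might be applied in Java, here is a minimal sketch. It assumes a simplified subset of the RFC 5322 pattern, keeping only the unquoted local part and the domain-name branch (no quoted strings, no bracketed IP literals). The names `EmailExtractor` and `extractEmails` are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class EmailExtractor {
    // Assumption: simplified subset of the RFC 5322 pattern
    // (dot-separated atoms in the local part, dot-separated labels in the domain).
    private static final String ATOM = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+";
    private static final String LABEL = "[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";
    private static final Pattern EMAIL = Pattern.compile(
            ATOM + "(?:\\." + ATOM + ")*@(?:" + LABEL + "\\.)+" + LABEL,
            Pattern.CASE_INSENSITIVE);

    // Returns each distinct address once, in order of first appearance.
    static List<String> extractEmails(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        String html = "href=\"mailto:contact@example.com\">contact@example.com</a></span>";
        System.out.println(extractEmails(html)); // prints [contact@example.com]
    }
}
```

The pattern is written in lowercase, so the sketch compiles it with `Pattern.CASE_INSENSITIVE`. The `LinkedHashSet` is only there because the sample fragment contains the same address twice.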

Answer 2

I would simply use a look-behind to anchor on the interesting attributes in the text, and then capture everything inside the "...".

Like this:

((?<=href="mailto:)|(?<=src="))[^"]+
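A minimal Java sketch of this look-behind approach could look as follows. The names `AttributeExtractor` and `extractValues` are hypothetical, and the outer group is made non-capturing here since only the whole match is read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class AttributeExtractor {
    // Match any run of non-quote characters that directly follows
    // href="mailto: or src=" (the look-behinds consume no input).
    private static final Pattern ATTR_VALUE =
            Pattern.compile("(?:(?<=href=\"mailto:)|(?<=src=\"))[^\"]+");

    static List<String> extractValues(String html) {
        List<String> values = new ArrayList<>();
        Matcher matcher = ATTR_VALUE.matcher(html);
        while (matcher.find()) {
            values.add(matcher.group());
        }
        return values;
    }

    public static void main(String[] args) {
        String html = "href=\"mailto:contact@example.com\">contact@example.com</a></span>";
        System.out.println(extractValues(html)); // prints [contact@example.com]
    }
}
```

Because the match is anchored to the attribute prefix, the visible link text between the tags is not picked up, so the address appears only once. The sketch assumes attribute values are wrapped in double quotes.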
