How can I extract only certain email addresses, out of many, from a text body?

Time: 2015-09-01 11:02:46

Tags: regex

A URL-encoded JSON payload contains many email addresses, each sent together with its role. Like this:

hl=en_US&token=AFNOsBbXXvng6zJmmPyIlya1dT48RKqmaQ%3A1441100947178&foreignService=explorer&shareService=explorer&authuser=0&locale=en_US&requestType=aclChange&itemIds=0B-i4kCZeNb05Y3FrVXFLYU41N0U&confirmed=false&modelChanges=%7B%22aclEntries%22%3A%5B%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22babar.memon%40gmail.com%22%2C%22id%22%3A%22112542596153041291285%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22babar.memon%40gmail.co%22%7D%2C%22role%22%3A30%7D%2C%7B%22scope%22%3A%7B%22iconUrl%22%3A%22%2Fc%2Fu%2F0%2Fphotos%2Fpublic%2FAIbEiAIAAABDCL_k77OCsqvJPSILdmNhcmRfcGhvdG8qKGM0MmEwMjBkZWQ0MDAzMzMwYjI2MjczZmNlZWVlMDA3NDUxMGI2N2MwAdau5OHbez_zFcRyTELkBcRF-Lv9%22%2C%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Saad%20Rehman%22%2C%22id%22%3A%22104436799417545912895%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A20%7D%2C%7B%22scope%22%3A%7B%22iconUrl%22%3A%22%2Fc%2Fu%2F0%2Fphotos%2Fpublic%2FAIbEiAIAAABDCL_k77OCsqvJPSILdmNhcmRfcGhvdG8qKGM0MmEwMjBkZWQ0MDAzMzMwYjI2MjczZmNlZWVlMDA3NDUxMGI2N2MwAdau5OHbez_zFcRyTELkBcRF-Lv9%22%2C%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Saad%20Rehman%22%2C%22id%22%3A%22104436799417545912895%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A60%7D%2C%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Asim%20Kazmi%22%2C%22id%22%3A%22118161687853857289891%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A20%7D%2C%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Asim%20Kazmi%22%2C%22id%22%3A%22118161687853857289891%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A60%7D%5D%7D

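What should I do if I only want the email addresses that have role 30 next to them or, in another rule, all the email addresses that have role 20 next to them?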

This is what I have so far:

.*?email.22.3A.22([a-zA-Z0-9_.+%-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.%-]+?).22[^r]+role.22.3A30.7D

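This should give me every email address that has role 30 next to it, i.e. babar.memon%40elastica.com. If I replace 30 with .0, I get all the email addresses, which is what I want, except that I want them in order: first all those with role 30, then those with role 20, and so on.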

The regex can be found here: https://regex101.com/r/rW0qO9/1

2 answers:

Answer 0: (score: 1)

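A regex can extract patterns from a string, but it cannot return them in an order of your choosing: each match is a substring of the input, and matches are found in the order they occur there. You have to collect the matches first and sort them in a post-processing step.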
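The collect-then-sort idea can be sketched in Python as follows. This is only an illustration under assumptions: `SAMPLE_BODY` is a made-up, shortened payload in the same shape as the question's data, `emails_by_role` is a hypothetical helper name, and the pattern captures the role number alongside each email so the pairs can be reordered afterwards.

```python
import re
from urllib.parse import unquote

# Hypothetical, shortened payload in the same URL-encoded shape as the question's data.
SAMPLE_BODY = (
    "%7B%22scope%22%3A%7B%22email%22%3A%22a.one%40example.com%22%7D%2C%22role%22%3A30%7D%2C"
    "%7B%22scope%22%3A%7B%22email%22%3A%22b.two%40example.org%22%7D%2C%22role%22%3A20%7D%2C"
    "%7B%22scope%22%3A%7B%22email%22%3A%22c.three%40example.net%22%7D%2C%22role%22%3A30%7D"
)

# Group 1: the still-encoded email; group 2: the role number that follows it.
EMAIL_ROLE = re.compile(
    r"email%22%3A%22([a-zA-Z0-9_.+-]+%40[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)"
    r"%22[^r]+role%22%3A(\d+)%7D"
)

def emails_by_role(body, role_order):
    """Return (role, email) pairs grouped in the order given by role_order."""
    # Step 1: collect every match in document order and decode %40 to '@'.
    pairs = [(int(role), unquote(email)) for email, role in EMAIL_ROLE.findall(body)]
    # Step 2: keep only the requested roles and sort by their position in role_order.
    rank = {role: i for i, role in enumerate(role_order)}
    wanted = [pair for pair in pairs if pair[0] in rank]
    # sorted() is stable, so emails sharing a role keep their original order.
    return sorted(wanted, key=lambda pair: rank[pair[0]])

for role, email in emails_by_role(SAMPLE_BODY, [30, 20]):
    print(role, email)
```

Passing the role order as a list keeps the ordering rule outside the regex, so "30 first, then 20" and any other ordering use the same pattern.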

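Alternatively, you can build the regex dynamically: parameterize the (.0) part of the regex with a variable that takes the values 20, 30, and 40, and extract the matches for each value in turn.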
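A minimal Python sketch of the parameterized approach, under stated assumptions: `role_pattern` and `extract_in_role_order` are hypothetical names, `SAMPLE_BODY` is a made-up shortened payload, and `%` is left out of the character classes so a capture cannot run past the closing `%22`.

```python
import re

# Hypothetical, shortened payload in the same URL-encoded shape as the question's data.
SAMPLE_BODY = (
    "%7B%22scope%22%3A%7B%22email%22%3A%22a.one%40example.com%22%7D%2C%22role%22%3A30%7D%2C"
    "%7B%22scope%22%3A%7B%22email%22%3A%22b.two%40example.org%22%7D%2C%22role%22%3A20%7D%2C"
    "%7B%22scope%22%3A%7B%22email%22%3A%22c.three%40example.net%22%7D%2C%22role%22%3A30%7D"
)

def role_pattern(role):
    """Build a pattern that matches only emails followed by the given role."""
    return re.compile(
        r"email%22%3A%22([a-zA-Z0-9_.+-]+%40[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)"
        # re.escape keeps the role literal even if it contained metacharacters.
        r"%22[^r]+role%22%3A" + re.escape(str(role)) + r"%7D"
    )

def extract_in_role_order(body, roles):
    """Run one pass per role, so results come out grouped in the order of roles."""
    result = []
    for role in roles:
        # With a single capture group, findall returns the captured emails as strings.
        result.extend(role_pattern(role).findall(body))
    return result

print(extract_in_role_order(SAMPLE_BODY, [30, 20]))
```

This scans the body once per role, which is simple but repeats work; for a long list of roles, a single pass that also captures the role may be preferable.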

Answer 1: (score: 0)

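Try this regex:

.*?email.22.3A.22([a-zA-Z0-9_.+-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.-]+).22[^r]+role.22.3A(20).7D

I changed the capture group slightly. An email address cannot contain special characters such as %, so inside the address a % should only appear where '@' is encoded as %40. With % removed from the character classes for the local part and for the end of the domain, the capture can no longer run past the closing %22 into the following fields.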
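To illustrate why removing % matters, the sketch below runs the question's pattern (with 20 substituted for 30) and this answer's pattern against a made-up, shortened payload. `SAMPLE_BODY`, `LOOSE`, and `TIGHT` are names introduced here for illustration only.

```python
import re

# Hypothetical, shortened payload in the same URL-encoded shape as the question's data.
SAMPLE_BODY = (
    "%7B%22scope%22%3A%7B%22email%22%3A%22a.one%40example.com%22%7D%2C%22role%22%3A30%7D%2C"
    "%7B%22scope%22%3A%7B%22email%22%3A%22b.two%40example.org%22%7D%2C%22role%22%3A20%7D"
)

# The question's pattern with role 20: '%' is allowed inside the captured address.
LOOSE = re.compile(
    r".*?email.22.3A.22([a-zA-Z0-9_.+%-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.%-]+?)"
    r".22[^r]+role.22.3A20.7D"
)

# This answer's pattern: '%' removed from the first and last character classes.
TIGHT = re.compile(
    r".*?email.22.3A.22([a-zA-Z0-9_.+-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.-]+)"
    r".22[^r]+role.22.3A(20).7D"
)

loose = LOOSE.search(SAMPLE_BODY).group(1)
tight = TIGHT.search(SAMPLE_BODY).group(1)

# The loose capture starts at the first email (role 30) and, because '%' is in
# its last class, keeps growing across the encoded quotes until a role 20 follows.
print("loose:", loose)
# The tight capture stops at the first '%', so only the role-20 address matches.
print("tight:", tight)
```

On this sample the loose pattern returns one long string that begins with the role-30 address, while the tight pattern returns only the address paired with role 20.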