We need to write an email validation program in C. We plan to use GNU C regular expressions (regex.h).
The regular expression we prepared is
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
However, the code below fails when compiling the regular expression.
#include <stdio.h>
#include <regex.h>

int main(void)
{
    const char *reg_exp = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";
    int status = 1;
    char email[71];
    regex_t preg;
    int rc;

    printf("The regex = %s\n", reg_exp);
    rc = regcomp(&preg, reg_exp, REG_EXTENDED|REG_NOSUB);
    if (rc != 0)
    {
        /* Map the regcomp error code to a short diagnostic. */
        if (rc == REG_BADPAT || rc == REG_ECOLLATE)
            fprintf(stderr, "Bad Regex/Collate\n");
        if (rc == REG_ECTYPE)
            fprintf(stderr, "Invalid Char\n");
        if (rc == REG_EESCAPE)
            fprintf(stderr, "Trailing \\\n");
        if (rc == REG_ESUBREG || rc == REG_EBRACK)
            fprintf(stderr, "Invalid number/[] error\n");
        if (rc == REG_EPAREN || rc == REG_EBRACE)
            fprintf(stderr, "Paren/Bracket error\n");
        if (rc == REG_BADBR || rc == REG_ERANGE)
            fprintf(stderr, "{} content invalid/Invalid endpoint\n");
        if (rc == REG_ESPACE)
            fprintf(stderr, "Memory error\n");
        if (rc == REG_BADRPT)
            fprintf(stderr, "Invalid regex\n");
        fprintf(stderr, "%s: Failed to compile the regular expression:%d\n", __func__, rc);
        return 1;
    }
    /* Read addresses until end of input or a line that starts with '0'. */
    while (status)
    {
        if (fgets(email, sizeof(email), stdin) == NULL)
            break;
        status = email[0] - '0';
        rc = regexec(&preg, email, (size_t)0, NULL, 0);
        if (rc == 0)
        {
            fprintf(stderr, "%s: The regular expression is a match\n", __func__);
        }
        else
        {
            fprintf(stderr, "%s: The regular expression is not a match: %d\n", __func__, rc);
        }
    }
    regfree(&preg);
    return 0;
}
The regular expression fails to compile with the following error.
The regex = [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
Invalid regex
main: Failed to compile the regular expression:13
What is the cause of this error? Does the regular expression need to be modified?
Thanks, Mathew Liju
Answer 0 (score: 2)
In case you are interested, I recently saw the Hacker News post "Perfect email regex finally found", which is about "Comparing E-mail Address Validating Regular Expressions".
The regexes:
// James Watts and Francisco Jose Martin Moreno are the first to develop one which
// passes all of the tests.
/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i
// Arluison Guillaume has also improved Warren Gaebel's regex.
// This one will work in JavaScript:
/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,5})?$/i
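A note for anyone trying these with regcomp, as in the question: the first regex relies on \w and \d, which are Perl-style shorthands rather than POSIX syntax, so it would need rewriting with bracket expressions. The second one carries over more directly. Below is a minimal sketch of how it might be compiled as a POSIX extended regex, with REG_ICASE standing in for the /i flag. The helper name js_style_email_matches and the constant JS_STYLE_PATTERN are made up for illustration, and the sketch assumes the \' inside the brackets can be written as a plain ', since a backslash is not an escape inside a POSIX bracket expression.

```c
#include <regex.h>
#include <stdio.h>

/* Assumed POSIX ERE rendering of the second (JavaScript) regex above.
 * The only change is dropping the backslash before ' inside the brackets. */
static const char *JS_STYLE_PATTERN =
    "^[-a-z0-9~!$%^&*_=+}{'?]+(\\.[-a-z0-9~!$%^&*_=+}{'?]+)*"
    "@([a-z0-9_][-a-z0-9_]*(\\.[-a-z0-9_]+)*"
    "\\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org"
    "|pro|travel|mobi|[a-z][a-z])"
    "|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(:[0-9]{1,5})?$";

/* Hypothetical helper: 1 = match, 0 = no match, -1 = pattern did not compile. */
int js_style_email_matches(const char *candidate)
{
    regex_t preg;
    /* REG_ICASE plays the role of the trailing /i in the JavaScript literal. */
    int rc = regcomp(&preg, JS_STYLE_PATTERN,
                     REG_EXTENDED | REG_NOSUB | REG_ICASE);
    if (rc != 0) {
        char errbuf[128];
        regerror(rc, &preg, errbuf, sizeof errbuf);
        fprintf(stderr, "regcomp failed: %s\n", errbuf);
        return -1;
    }
    rc = regexec(&preg, candidate, 0, NULL, 0);
    regfree(&preg);
    return rc == 0 ? 1 : 0;
}
```

Because the pattern is anchored with ^ and $, a line read with fgets would need its trailing newline removed before being passed to the helper.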
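Answer 1 (score: 1)
Your problem is the four instances of the sequence (?. This makes no sense: ( starts a new sub-regex, and you cannot have ? at the start of a regex, because ? is a repetition operator and needs a preceding expression to apply to.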
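To make that concrete: (?: ... ) is the Perl-style non-capturing group, which POSIX regcomp does not understand. A minimal sketch of one possible fix is to replace each (?: with a plain ( and escape the dots as \\. in the C string. The helper name email_matches and the constant EMAIL_PATTERN are invented for this example, and regerror is used in place of the hand-written error table only to keep the sketch short.

```c
#include <regex.h>
#include <stdio.h>

/* The question's pattern with every "(?:" replaced by a plain "(".
 * "\\." in the C source yields \. in the regex, i.e. a literal dot. */
static const char *EMAIL_PATTERN =
    "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    "@([a-z0-9]([a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9]([a-z0-9-]*[a-z0-9])?";

/* Hypothetical helper: 1 = match, 0 = no match, -1 = pattern did not compile. */
int email_matches(const char *candidate)
{
    regex_t preg;
    int rc = regcomp(&preg, EMAIL_PATTERN, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        /* regerror turns the numeric code into a readable message. */
        char errbuf[128];
        regerror(rc, &preg, errbuf, sizeof errbuf);
        fprintf(stderr, "regcomp failed: %s\n", errbuf);
        return -1;
    }
    rc = regexec(&preg, candidate, 0, NULL, 0);
    regfree(&preg);
    return rc == 0 ? 1 : 0;
}
```

Like the original, this pattern is not anchored, so it reports a match if any part of the input looks like an address. Adding ^ at the start and $ at the end would require the whole string to match.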
Answer 2 (score: 1)
I use this POSIX expression in standard C:
const char *reg_exp = "^([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*@([a-z0-9])"
"(([a-z0-9-])*([a-z0-9]))+(.([a-z0-9])([-a-z0-9_-])?"
"([a-z0-9])+)+$";