While the below C function does a good job to validate any combination of URL/FQDN but it fails to validate IPv4 addresses and Shorthand notation of IPv6 and certain other IPv6 format addresses.
Can the below regex be improvised to validate IPv4 addresses and IPv6 addresses?
int validateURLPhase2(char *url)
{
int status;
regex_t re;
char *regexp = "^((ftp|http|https)://)?([a-z0-9]([-a-z0-9]*[a-z0-9])?\\.)|([0-9].[0-9].[0-9].[0-9])|(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))+((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)|(c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]|(g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)|(j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]|(m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)|(p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]|(t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw])$";
if ( regcomp(&re, regexp, REG_EXTENDED|REG_NOSUB|REG_ICASE) != 0 )
{
printf( "Regex has invalidated FQDN 1\n");
return -1;
}
status = regexec(&re, url, (size_t) 0, NULL, 0);
regfree(&re);
if ( status != 0 )
{
printf("Regex has invalidated FQDN 2\n");
return -1;
}
return 0;
}
Valid URL format that ideally should be accepted but was failed: http://[2001::1]/abc Regex has invalidated FQDN 2 validation failed
Invalid URL format that ideally should be rejected but was success: http://10.192.1 validation success
Other cases passed: http://10.2.1.1/abc http://www.example.com/abc
答案 0 :(得分:0)
The part of your regexp that matches numeric addresses only allows a single digit in each component. It also doesn't escape the .
, so it's matching anything. It should be:
([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3})
Note that this will allow invalid IPs like 123.456.789.0
. It just checks that each number is 1-3 digits, not that it's between 0
and 255
.