I am trying to write a regular expression that allows a wildcard only in the left-most label of a domain name. So far I have this:
    import re
    from socket import socket
    from sys import argv
    from urlparse import urlparse

    from OpenSSL import SSL

    o = urlparse(argv[1])
    host_name = o.netloc
    context = SSL.Context(SSL.TLSv1_METHOD)  # Use TLS Method
    context.set_options(SSL.OP_NO_SSLv2)  # Don't accept SSLv2
    # `callback` is the verification callback; its definition is not shown here
    context.set_verify(SSL.VERIFY_PEER | SSL.VERIFY_FAIL_IF_NO_PEER_CERT,
                       callback)
    # context.load_verify_locations(ca_file, ca_path)
    sock = socket()
    ssl_sock = SSL.Connection(context, sock)
    ssl_sock.connect((host_name, 443))
    ssl_sock.set_connect_state()
    ssl_sock.set_tlsext_host_name(host_name)
    ssl_sock.do_handshake()
    cert = ssl_sock.get_peer_certificate()
    common_name = cert.get_subject().commonName.decode()
    print "Common Name: ", common_name
    print "Cert number: ", cert.get_serial_number()
    regex = common_name.replace('.', r'\.').replace('*', r'.*') + '$'
    if re.match(regex, host_name):
        print "matches"
    else:
        print "invalid"

    # output:
    Common Name: *.example.com
    Cert number: 63694395280496902491340707875731768741
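However, the regular expression matches not only *.example.com but also *.*.* or www.*.com. In addition, https://wrong.host.example.com/ should not be allowed to match. How can I make sure the wildcard matches only the left-most label?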
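The over-matching can be reproduced without a TLS connection. The sketch below wraps the same string replacement in a helper (the name `naive_regex` and the host `www.example.org` are made up for illustration) and checks the three problem cases:

```python
import re


def naive_regex(common_name):
    # Same construction as in the question: '*' becomes '.*',
    # and '.*' is free to match dots as well.
    return common_name.replace('.', r'\.').replace('*', r'.*') + '$'


# '.*' spans label boundaries, so a two-level subdomain is accepted.
deep_match = re.match(naive_regex('*.example.com'),
                      'wrong.host.example.com') is not None

# A common name made only of wildcards accepts an unrelated host.
wild_match = re.match(naive_regex('*.*.*'), 'www.example.org') is not None

# A wildcard outside the left-most label is accepted as well.
inner_match = re.match(naive_regex('www.*.com'),
                       'www.example.com') is not None
```

All three flags come out true, which is exactly the behaviour that should be rejected.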
Answer 0 (score: 0)
You can use urlparse and split instead of a regular expression.
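    from urlparse import urlparse
    # ...
    common_name = cert.get_subject().commonName.decode()
    # A bare common name has no scheme, so urlparse() would leave netloc empty;
    # the '//' prefix makes it parse the name as the network location.
    domain = urlparse('//' + common_name).netloc
    # Left-most label of the name, e.g. '*' for '*.example.com'
    host = domain.split('.', 1)[0]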
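Taking the split idea one step further, the whole comparison can be done label by label with no regular expression at all. This is only a sketch under my own assumptions: the function name `wildcard_match` is made up, partial wildcards such as f*.example.com are simply rejected, and the rules follow a simplified reading of RFC 6125, section 6.4.3 (under which the bare example.com does not match *.example.com).

```python
def wildcard_match(common_name, host_name):
    """Return True if host_name matches common_name, with '*' allowed
    only as the complete left-most label of common_name."""
    cn_labels = common_name.lower().split('.')
    host_labels = host_name.lower().split('.')

    # The wildcard stands for exactly one label, so the counts must agree.
    if len(cn_labels) != len(host_labels):
        return False

    first, rest = cn_labels[0], cn_labels[1:]

    # Reject wildcards anywhere except the left-most label.
    if any('*' in label for label in rest):
        return False

    if first == '*':
        # The wildcard must stand for a non-empty label.
        return host_labels[0] != '' and host_labels[1:] == rest

    # Partial wildcards (e.g. 'f*') are not supported in this sketch.
    if '*' in first:
        return False

    return cn_labels == host_labels
```

For example, `wildcard_match('*.example.com', 'www.example.com')` is true, while `wrong.host.example.com` is rejected because the label counts differ.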
Answer 1 (score: 0)
You can try this regular expression:

    r'(?:^|\s)(\w+\.)?example\.com(?:$|\s)'

Full demo:
    sock = socket()
    ssl_sock = SSL.Connection(context, sock)
    ssl_sock.connect((host_name, 443))
    ssl_sock.set_connect_state()
    ssl_sock.set_tlsext_host_name(host_name)
    ssl_sock.do_handshake()
    cert = ssl_sock.get_peer_certificate()
    common_name = cert.get_subject().commonName.decode()
    print "Common Name: ", common_name
    print "Cert number: ", cert.get_serial_number()
    # [3:] drops the leading '*\.' of the escaped common name, so this
    # assumes the common name starts with '*.'
    rxString = r'(?:^|\s)(\w+\.)?' + common_name.replace('.', r'\.')[3:] + r'(?:$|\s)'
    regex = re.compile(rxString)
    if regex.match(host_name):
        print "matches"
    else:
        print "invalid"
Input:

    url
    -------------------
    www.example.com
    example.com
    hello.example.com
    foo.bar.example.com
    *.*.*
    www.*.com
Output:

    url                 | result
    ------------------- | -----------
    www.example.com     | matches
    example.com         | matches
    hello.example.com   | matches
    foo.bar.example.com | invalid
    *.*.*               | invalid
    www.*.com           | invalid
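Answer 2 (score: 0)
Unfortunately, the regular expression in Saleem's answer is wrong and does not comply with RFC 6125 [6.4.3].
I think the best approach is to replace the '*' character with '[^.]+' (or '[^.]*'; the RFC is not clear on whether f.example.com matches f*.example.com):

    rxString = '^' + common_name.replace('.', r'\.').replace('*', r'[^.]+') + '$'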
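A slightly more defensive version of the same idea might look like the sketch below. The helper name `cn_to_regex` is made up, and two choices are my own assumptions rather than anything the RFC mandates: wildcards outside the left-most label raise an error, and `re.escape` is used so that any other regex metacharacter in the common name is treated literally.

```python
import re


def cn_to_regex(common_name):
    """Compile a pattern in which '*' is honoured only inside the
    left-most label and never matches a dot."""
    first, sep, rest = common_name.partition('.')
    if '*' in rest:
        raise ValueError('wildcard outside the left-most label: %r'
                         % common_name)
    # Escape the literal pieces around each '*', then let every '*'
    # stand for one or more non-dot characters.
    first_rx = '[^.]+'.join(re.escape(part) for part in first.split('*'))
    return re.compile('^' + first_rx + re.escape(sep + rest) + '$',
                      re.IGNORECASE)


pattern = cn_to_regex('*.example.com')
hosts = ['www.example.com', 'example.com', 'hello.example.com',
         'foo.bar.example.com', '*.*.*', 'www.*.com']
results = [(host, pattern.match(host) is not None) for host in hosts]
```

With this pattern only www.example.com and hello.example.com are accepted from the list above. Swapping '[^.]+' for '[^.]*' would additionally let f*.example.com match f.example.com, which is the open question mentioned in this answer.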