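I have a Postgres database holding nearly 200,000 values of a network address type. I would like to detect whether some subnets overlap each other, for example detect 123.0.0.0/16, 123.2.0.0/24 and 123.3.4.128/30, and report them.
I already use a lot of Python scripts and the netaddr library.
Given the number of entries, what is the best way or algorithm to detect overlaps?
I am pretty sure there is a better way than comparing each entry against the whole database.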
Answer 0 (score: 5)
I think the following should be a fairly efficient approach:
import netaddr
import bisect

def subnets_overlap(subnets):
    # ranges will be a sorted list of alternating start and end addresses
    ranges = []
    for subnet in subnets:
        # find indices to insert start and end addresses
        first = bisect.bisect_left(ranges, subnet.first)
        last = bisect.bisect_right(ranges, subnet.last)
        # check the overlap conditions and return if one is met
        if first != last or first % 2 == 1:
            return True
        ranges[first:first] = [subnet.first, subnet.last]
    return False
Example:
>>> subnets_overlap([netaddr.IPNetwork('1.0.0.0/24'), netaddr.IPNetwork('1.0.0.252/30')])
True
>>> subnets_overlap([netaddr.IPNetwork('1.0.0.0/24'), netaddr.IPNetwork('1.0.1.0/24')])
False
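The function above only answers whether any overlap exists, while the question also asks to report the overlapping subnets. A minimal sketch of one way to do that follows. It is not part of the original answer: the name `find_overlaps` is hypothetical, it uses the standard-library `ipaddress` module instead of netaddr so that it runs without extra dependencies, and it assumes IPv4-only input given as valid CIDR strings. It relies on the fact that two CIDR blocks are either disjoint or nested, so after sorting by start address a stack of enclosing networks is enough.

```python
import ipaddress

def find_overlaps(cidrs):
    """Return (outer, inner) pairs where `inner` lies inside (or equals) `outer`.

    Assumption: all entries are IPv4 CIDR strings with no host bits set.
    """
    nets = [ipaddress.ip_network(c) for c in cidrs]
    # Sort by start address; for equal starts, put the larger block first.
    nets.sort(key=lambda n: (int(n.network_address), n.prefixlen))
    stack = []      # chain of nested networks that may still enclose later ones
    overlaps = []
    for net in nets:
        start = int(net.network_address)
        # Drop networks that end before this one starts.
        while stack and int(stack[-1].broadcast_address) < start:
            stack.pop()
        # Every network left on the stack contains `net`.
        overlaps.extend((outer, net) for outer in stack)
        stack.append(net)
    return overlaps

if __name__ == "__main__":
    sample = ["10.0.0.0/16", "10.0.2.0/24", "10.0.2.128/30", "10.1.0.0/24"]
    for outer, inner in find_overlaps(sample):
        print(inner, "is inside", outer)
```

Sorting dominates the cost, so this is roughly O(n log n) plus the number of reported pairs, which should be manageable for about 200,000 entries.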
Answer 1 (score: 1)
import sys
from pprint import pprint
from netaddr import IPNetwork

def cidrsOverlap(cidr0):
    subnets_list = [IPNetwork('123.0.0.0/16'),
                    IPNetwork('123.2.0.0/24'),
                    IPNetwork('123.132.0.0/20'),
                    IPNetwork('123.142.0.0/20')]
    matching_subnets = []
    for subnet in subnets_list:
        # two ranges overlap when each one starts before the other ends
        if subnet.first <= cidr0.last and subnet.last >= cidr0.first:
            matching_subnets.append(subnet)
    print("Matching subnets for given %s are %s" % (cidr0, matching_subnets))
    pprint(subnets_list)

cidrsOverlap(IPNetwork(sys.argv[1]))
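The linear scan above visits every stored subnet for each query. As a rough sketch of how a single lookup could avoid that, the stored networks can be indexed once and then queried with a set lookup plus a binary search. This is not from the original answer: `build_index` and `overlapping` are hypothetical names, the code uses the standard-library `ipaddress` module rather than netaddr, and it assumes IPv4-only CIDR strings with no host bits set. It uses two observations: any stored network that contains the query must be one of the query's supernets (at most 33 candidates for IPv4), and any stored network that starts inside the query's address range overlaps it.

```python
import bisect
import ipaddress

def build_index(cidrs):
    """Precompute a set for exact lookups and a list sorted by start address."""
    nets = {ipaddress.ip_network(c) for c in cidrs}
    by_start = sorted(nets, key=lambda n: (int(n.network_address), n.prefixlen))
    starts = [int(n.network_address) for n in by_start]
    return nets, by_start, starts

def overlapping(query, index):
    """Return the set of stored networks that overlap `query` (IPv4 assumed)."""
    nets, by_start, starts = index
    q = ipaddress.ip_network(query)
    found = set()
    # Stored networks that contain the query are among its supernets.
    for prefix in range(q.prefixlen + 1):
        candidate = q.supernet(new_prefix=prefix)
        if candidate in nets:
            found.add(candidate)
    # Stored networks that start inside the query range overlap it as well.
    lo = bisect.bisect_left(starts, int(q.network_address))
    hi = bisect.bisect_right(starts, int(q.broadcast_address))
    found.update(by_start[lo:hi])
    return found

if __name__ == "__main__":
    index = build_index(["123.0.0.0/16", "123.2.0.0/24",
                         "123.132.0.0/20", "123.142.0.0/20"])
    print(overlapping("123.2.0.128/25", index))
```

Building the index costs one sort; each query then costs a bounded number of set lookups plus a binary search, in addition to the size of the result.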