What is the best way in Python to determine which values two ranges have in common?
For example:
x = range(1,10)
y = range(8,20)
(The answer I am looking for would be the integers 8 and 9.)
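Given a range x, what is the best way to iterate over another range y and output all values shared by both ranges? Thanks in advance for your help.
EDIT
As a follow-up, I realized I also need to know whether x overlaps y at all. I am looking for a way to iterate over a list of ranges and do some extra work for the ranges that overlap. Is there a simple True/False statement to accomplish this?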
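Answer 0 (score: 58)
If the step is always +1 (the default for range), the following should be more efficient than converting each list to a set or iterating over either list: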
range(max(x[0], y[0]), min(x[-1], y[-1])+1)
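One caveat with the expression above is that x[0] and x[-1] raise IndexError when either range is empty. A minimal Python 3 sketch that guards against this, assuming both ranges have a step of 1 (the name step1_intersection is hypothetical, not part of the answer):

```python
def step1_intersection(x, y):
    """Intersection of two step-1 ranges, computed from their endpoints.

    Assumes both ranges use the default step of 1.
    """
    # Indexing an empty range raises IndexError, so handle that case first.
    if len(x) == 0 or len(y) == 0:
        return range(0)
    # x[-1] and y[-1] are the last values actually in each range, hence the +1.
    return range(max(x[0], y[0]), min(x[-1], y[-1]) + 1)


print(list(step1_intersection(range(1, 10), range(8, 20))))   # [8, 9]
print(list(step1_intersection(range(1, 10), range(10, 14))))  # []
print(list(step1_intersection(range(0), range(8, 20))))       # []
```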
Answer 1 (score: 43)
Try set intersection:
>>> x = range(1,10)
>>> y = range(8,20)
>>> xs = set(x)
>>> xs.intersection(y)
set([8, 9])
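Note that intersection accepts any iterable as an argument (y does not need to be converted to a set for the operation).
There is an operator equivalent to the intersection method, &, but in that case it requires both arguments to be sets.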
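The difference between the method and the operator can be seen in a short Python 3 sketch (the session above is Python 2, where a set prints as set([8, 9]); sorted() is used here only to make the printed order explicit):

```python
x = range(1, 10)
y = range(8, 20)
xs = set(x)

# The method accepts any iterable, including a range.
print(sorted(xs.intersection(y)))  # [8, 9]

# The operator needs a set on both sides.
print(sorted(xs & set(y)))         # [8, 9]

# Passing a non-set to the operator fails.
try:
    xs & y
except TypeError as exc:
    print("TypeError:", exc)
```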
Answer 2 (score: 13)
You can use a set for this, but be aware that set(list) removes all duplicate entries from list:
>>> x = range(1,10)
>>> y = range(8,20)
>>> list(set(x) & set(y))
[8, 9]
Answer 3 (score: 9)
One option is to use a list comprehension, like:
x = range(1,10)
y = range(8,20)
z = [i for i in x if i in y]
print z
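The snippet above uses the Python 2 print statement. In Python 3 the same comprehension is cheap even for long ranges, because a membership test on a range object with an integer is computed arithmetically instead of by scanning. A small sketch:

```python
x = range(1, 10)
y = range(8, 20)

# Each `i in y` test is an arithmetic check on y's start, stop and step.
z = [i for i in x if i in y]
print(z)  # [8, 9]

# Membership works without materializing the range, even for huge ones.
print(10**12 in range(0, 10**15, 2))  # True
```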
Answer 4 (score: 4)
For "whether x overlaps y":
for a,b,c,d in ((1,10,10,14),
                (1,10,9,14),
                (1,10,4,14),
                (1,10,4,10),
                (1,10,4,9),
                (1,10,4,7),
                (1,10,1,7),
                (1,10,-3,7),
                (1,10,-3,2),
                (1,10,-3,1),
                (1,10,-11,-5)):
    x = range(a,b)
    y = range(c,d)
    print 'x==',x
    print 'y==',y
    # x and y overlap unless one ends before the other begins
    overlaps = not ((x[-1]<y[0]) or (y[-1]<x[0]))
    print ' x %s y' % ("does not overlap"," OVERLAPS ")[overlaps]
    print
Result
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [10, 11, 12, 13]
x does not overlap y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [9, 10, 11, 12, 13]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [1, 2, 3, 4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0]
x does not overlap y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-11, -10, -9, -8, -7, -6]
x does not overlap y
Speed comparison:
from time import clock
x = range(-12,15)
y = range(-5,3)
te = clock()
for i in xrange(100000):
    w = set(x).intersection(y)
print ' set(x).intersection(y)',clock()-te
te = clock()
for i in xrange(100000):
    w = range(max(x[0], y[0]), min(x[-1], y[-1])+1)
print 'range(max(x[0], y[0]), min(x[-1], y[-1])+1)',clock()-te
Result
set(x).intersection(y) 0.951059981087
range(max(x[0], y[0]), min(x[-1], y[-1])+1) 0.377761978129
The ratio of these execution times is about 2.5.
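The comparison above is Python 2 code (print statement, xrange, and time.clock, which was removed in Python 3.8). A rough Python 3 equivalent using the standard timeit module might look like the sketch below. The function names by_set and by_endpoints are made up for illustration, and the timings will differ by machine and interpreter, so no expected output is shown:

```python
import timeit

x = range(-12, 15)
y = range(-5, 3)


def by_set():
    # Build a set from x and intersect it with the iterable y.
    return set(x).intersection(y)


def by_endpoints():
    # Step-1 ranges only: derive the intersection from the endpoints.
    return range(max(x[0], y[0]), min(x[-1], y[-1]) + 1)


for fn in (by_set, by_endpoints):
    seconds = timeit.timeit(fn, number=100_000)
    print(f"{fn.__name__:>12}: {seconds:.3f} s")
```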
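Answer 5 (score: 1)
If you want to find the overlap of ranges with arbitrary steps, you can use my package https://github.com/avnr/rangeplus, which provides a Range() class that is compatible with Python's range(), plus some extras including intersection: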
>>> from rangeplus import Range
>>> Range(1, 100, 3) & Range(2, 100, 4)
Range(10, 100, 12)
>>> Range(200, -200, -7) & range(5, 80, 2) # can intersect with Python range() too
Range(67, 4, -14)
Range() can also be unbounded (when stop is None, the Range continues to +/- infinity):
>>> Range(1, None, 3) & Range(3, None, 4)
Range(7, None, 12)
>>> Range(253, None, -3) & Range(208, 310, 5)
Range(253, 207, -15)
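The intersection is computed rather than iterated, which makes the efficiency of the implementation independent of the length of the Range().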
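To illustrate how an intersection can be computed rather than iterated, here is a minimal standard-library sketch. It is not the rangeplus implementation; the function name intersect_stepped is hypothetical, it assumes bounded ranges with positive steps only, and it needs Python 3.8+ for pow(a, -1, m). The idea is to solve the congruence a1 + k*s1 ≡ a2 (mod s2) for the first common value and then step by the least common multiple of the two steps:

```python
from math import gcd


def intersect_stepped(r1, r2):
    """Intersection of two built-in ranges with positive steps, as a range."""
    if r1.step < 0 or r2.step < 0:
        raise ValueError("this sketch only handles positive steps")
    if not r1 or not r2:          # an empty range is falsy
        return range(0)
    a1, s1 = r1.start, r1.step
    a2, s2 = r2.start, r2.step
    g = gcd(s1, s2)
    if (a2 - a1) % g:             # the two progressions never meet
        return range(0)
    lcm = s1 // g * s2
    m = s2 // g
    # Solve k * (s1/g) == (a2 - a1)/g  (mod m) for the smallest k >= 0.
    k = 0 if m == 1 else ((a2 - a1) // g * pow(s1 // g, -1, m)) % m
    first = a1 + k * s1           # common value of both progressions, >= a1
    lo = max(a1, a2)
    if first < lo:                # advance to the first common value >= both starts
        first += -(-(lo - first) // lcm) * lcm
    return range(first, min(r1.stop, r2.stop), lcm)


print(intersect_stepped(range(1, 100, 3), range(2, 100, 4)))  # range(10, 100, 12)
```

The printed result agrees with the first Range(1, 100, 3) & Range(2, 100, 4) example above. The cost depends on the size of the steps, not on how many elements the ranges contain.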
Answer 6 (score: 1)
If you are looking for the overlap between two real-valued bounded intervals, this works nicely:
def overlap(start1, end1, start2, end2):
    """how much does the range (start1, end1) overlap with (start2, end2)"""
    return max(max((end2-start1), 0) - max((end2-end1), 0) - max((start2-start1), 0), 0)
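For intervals with start <= end, the formula above gives the same length as the more common form max(0, min(end1, end2) - max(start1, start2)). The sketch below repeats the answer's function and compares it with that form; the name overlap_simple is made up for illustration:

```python
def overlap(start1, end1, start2, end2):
    """Length of the overlap between (start1, end1) and (start2, end2)."""
    return max(max((end2 - start1), 0) - max((end2 - end1), 0) - max((start2 - start1), 0), 0)


def overlap_simple(start1, end1, start2, end2):
    """Same length: the later start up to the earlier end, clipped at zero."""
    return max(0, min(end1, end2) - max(start1, start2))


print(overlap(1, 10, 8, 20))        # 2
print(overlap(0, 10, 2.5, 4))       # 1.5
print(overlap(0, 1, 5, 6))          # 0
print(overlap_simple(1, 10, 8, 20)) # 2
```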
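I could not find this anywhere online, so I worked it out and am posting it here.
Answer 7 (score: 1)
This answer covers simple ranges with a step of 1 (the case 99% of the time). When you only want to know whether there is any overlap, it can be thousands of times faster than comparing long ranges with sets, as the benchmarks below show: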
x = range(1,10)
y = range(8,20)
def range_overlapping(x, y):
    if x.start == x.stop or y.start == y.stop:
        return False
    return ((x.start < y.stop and x.stop > y.start) or
            (x.stop > y.start and y.stop > x.start))
>>> range_overlapping(x, y)
True
To find the overlapping values:
def overlap(x, y):
    if not range_overlapping(x, y):
        return set()
    # stop is exclusive, so no +1 is needed here
    return set(range(max(x.start, y.start), min(x.stop, y.stop)))
Visual aid:
| | | |
| | | |
Benchmark:
x = range(1,10)
y = range(8,20)
In [151]: %timeit set(x).intersection(y)
2.74 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [152]: %timeit range_overlapping(x, y)
1.4 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Conclusion: even for small ranges, it is about twice as fast.
x = range(1,10000)
y = range(50000, 500000)
In [155]: %timeit set(x).intersection(y)
43.1 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [156]: %timeit range_overlapping(x, y)
1.75 µs ± 88.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
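Conclusion: you want to use the range_overlapping function in this case, as it is roughly 25,000 times faster by the timings above (43.1 ms versus 1.75 µs), my personal record for a speedup.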
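Answer 8 (score: 1)
The answers above seem overly complicated. This one-liner works perfectly in Python 3, taking ranges as input and returning a range as output. It also handles illegal ranges. To get the values, iterate over the result (if it is not None).
# return overlap range for two range objects or None if no overlap
# does not handle step!=1
def range_intersect(r1, r2):
    return range(max(r1.start,r2.start), min(r1.stop,r2.stop)) or None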
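The `or None` works because an empty range is falsy. As a usage sketch for the follow-up in the question (walking a list of ranges and doing something extra for the pairs that overlap), one possibility is to combine the function with itertools.combinations; the list of ranges here is invented for illustration:

```python
from itertools import combinations


def range_intersect(r1, r2):
    # Step-1 ranges only; an empty result is falsy, so `or None` returns None.
    return range(max(r1.start, r2.start), min(r1.stop, r2.stop)) or None


ranges = [range(1, 10), range(8, 20), range(30, 40)]

for a, b in combinations(ranges, 2):
    common = range_intersect(a, b)
    if common is not None:
        # Extra work for overlapping pairs goes here.
        print(a, "and", b, "share", list(common))
# prints: range(1, 10) and range(8, 20) share [8, 9]
```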
Answer 9 (score: 0)
Assuming you are working exclusively with ranges with a step of 1, you can do it quickly with math.
def range_intersect(range_x,range_y):
    if len(range_x) == 0 or len(range_y) == 0:
        return []
    # find the endpoints
    x = (range_x[0], range_x[-1]) # from the first element to the last, inclusive
    y = (range_y[0], range_y[-1])
    # ensure min is before max
    # this can be excluded if the ranges must always be increasing
    x = tuple(sorted(x))
    y = tuple(sorted(y))
    # the range of the intersection is guaranteed to be from the maximum of the min values to the minimum of the max values, inclusive
    z = (max(x[0],y[0]),min(x[1],y[1]))
    if z[0] <= z[1]: # <= so that a single shared value still counts as an intersection
        return range(z[0], z[1] + 1) # to make this an inclusive range
    else:
        return [] # no intersection
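On a pair of ranges with more than 10^7 elements each, this takes under a second, regardless of how many elements overlap. I tried it with around 10^8 elements, but my computer froze for a while. I doubt you will be working with lists that long.
Answer 10 (score: 0)
This solution generates the integers in the intersection of an arbitrary number of range objects using O(1) memory.
Disclosure: I got this from a user in Python Chat after I had tried something else... less elegant.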
def range_intersection(*ranges):
    ranges = set(ranges) # `range` is hashable so we can easily eliminate duplicates
    if not ranges: return
    shortest_range = min(ranges, key=len) # we will iterate over one, so choose the shortest one
    ranges.remove(shortest_range) # note: `range` has a length, so we can use `len`
    for i in shortest_range:
        if all(i in range_ for range_ in ranges): yield i # Finally, `range` implements `__contains__`
                                                          # by checking if an integer satisfies its simple formula
The OP's problem
x = range(1,10)
y = range(8,20)
list(range_intersection(x, y))
[8, 9]
My example
limit = 10_000
list(range_intersection(
    range(2, limit, 2),
    range(3, limit, 3),
    range(5, limit, 5),
    range(41, limit, 41),
))
[1230, 2460, 3690, 4920, 6150, 7380, 8610, 9840]