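So I have a list of indices,
[0, 1, 2, 3, 5, 7, 8, 10]
and want to convert it to this,
[[0, 3], [5], [7, 8], [10]]
This will run on a large number of indices.
Also, this technically isn't for slices in Python; the tool I am working with is faster when given a range than when given individual IDs.
The pattern is based on being in a range, like slices work in Python. So in the example, the 1 and 2 are dropped because they are already included in the range 0 to 3. The 5 needs to be accessed individually since it is not part of a range, and so on. This is more helpful when a large number of IDs get included in a range such as [0, 5000].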
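To make the target format concrete, here is a small sketch of how such a list could be expanded back into the original indices. It assumes both ends of each pair are inclusive (so [0, 3] covers 0, 1, 2 and 3), which is what the example implies; note that this differs from a real Python slice, whose end is exclusive. The name expand_ranges is made up for illustration.

```python
def expand_ranges(ranges):
    """Expand [[0, 3], [5]] style ranges back into a flat list of indices."""
    indices = []
    for r in ranges:
        start, end = r[0], r[-1]  # a single-element entry is its own start and end
        indices.extend(range(start, end + 1))  # +1 because the end is inclusive
    return indices

print(expand_ranges([[0, 3], [5], [7, 8], [10]]))
# [0, 1, 2, 3, 5, 7, 8, 10]
```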
Answer 0 (score: 6)
Since you want the code to be fast, I wouldn't try to be too fancy. A straightforward approach should perform quite well:
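a = [0, 1, 2, 3, 5, 7, 8, 10]
it = iter(a)
start = next(it)
slices = []
# enumerate() starts at 0 while `it` is already one element ahead,
# so a[i] is always the element just before x.
for i, x in enumerate(it):
    if x - a[i] != 1:
        # Gap found: close the current run at the previous element.
        end = a[i]
        if start == end:
            slices.append([start])
        else:
            slices.append([start, end])
        start = x
# Close the final run.
if a[-1] == start:
    slices.append([start])
else:
    slices.append([start, a[-1]])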
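Admittedly, this doesn't look too nice, but I expect the nicer solutions I can think of to perform worse. (I did not do a benchmark.)
Here is a slightly nicer, but slower solution: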
from itertools import groupby
a = [0, 1, 2, 3, 5, 7, 8, 10]
slices = []
# value - position is constant within a run of consecutive integers
for key, it in groupby(enumerate(a), lambda x: x[1] - x[0]):
    indices = [y for x, y in it]
    if len(indices) == 1:
        slices.append([indices[0]])
    else:
        slices.append([indices[0], indices[-1]])
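Since neither version was benchmarked, here is a rough sketch of how the two could be timed against each other with timeit. The function names loop_slices and groupby_slices and the test data are made up for illustration, and no timings are claimed here; results will depend on the Python version and on how the gaps are distributed in the input.

```python
import random
import timeit
from itertools import groupby

def loop_slices(a):
    """Plain loop version (assumes a is sorted and non-empty)."""
    it = iter(a)
    start = next(it)
    slices = []
    for i, x in enumerate(it):
        if x - a[i] != 1:
            end = a[i]
            slices.append([start] if start == end else [start, end])
            start = x
    slices.append([start] if a[-1] == start else [start, a[-1]])
    return slices

def groupby_slices(a):
    """groupby version (assumes a is sorted)."""
    slices = []
    for _, group in groupby(enumerate(a), lambda x: x[1] - x[0]):
        indices = [y for _, y in group]
        slices.append([indices[0]] if len(indices) == 1 else [indices[0], indices[-1]])
    return slices

# Hypothetical test data: sorted unique indices with random gaps.
random.seed(0)
data = sorted(random.sample(range(200_000), 100_000))

assert loop_slices(data) == groupby_slices(data)  # both must agree before timing

for func in (loop_slices, groupby_slices):
    seconds = timeit.timeit(lambda: func(data), number=5)
    print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```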
Answer 1 (score: 3)
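import itertools
def runs(seq):
    previous = None
    start = None
    # The trailing None is a sentinel that flushes the last run.
    for value in itertools.chain(seq, [None]):
        if start is None:
            start = value
        if previous is not None and value != previous + 1:
            if start == previous:
                yield [previous]
            else:
                yield [start, previous]
            start = value
        previous = value
# list(runs([0, 1, 2, 3, 5, 7, 8, 10])) gives [[0, 3], [5], [7, 8], [10]]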
Answer 2 (score: 1)
Since performance is an issue, go with the first solution by @SvenMarnach, but here is a fun one-liner split over two lines! :D
>>> from itertools import groupby, count
>>> indices = [0, 1, 2, 3, 5, 7, 8, 10]
>>> [[next(v)] + list(v)[-1:]
...  for k, v in groupby(indices, lambda x, c=count(): x - next(c))]
[[0, 3], [5], [7, 8], [10]]
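The one-liner relies on the fact that, within a run of consecutive integers, each value minus its position in the list is constant, so groupby puts the whole run into one group. The sketch below only spells out that intermediate step; the names keys, key_func and groups are introduced here for illustration.

```python
from itertools import count, groupby

indices = [0, 1, 2, 3, 5, 7, 8, 10]

# Key for each element: value minus its position in the list.
keys = [x - i for i, x in enumerate(indices)]
print(keys)
# [0, 0, 0, 0, 1, 2, 2, 3]

# The same key, produced with the default-argument counter used in the one-liner.
key_func = lambda x, c=count(): x - next(c)
groups = [list(g) for _, g in groupby(indices, key_func)]
print(groups)
# [[0, 1, 2, 3], [5], [7, 8], [10]]
```

The counter is created once, when the lambda is defined, so such a key function keeps counting if it is reused on a second list; it is effectively single-use.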
Answer 3 (score: 0)
Here is some simple Python code using numpy:
import numpy
def list_to_slices(inputlist):
    """
    Convert a flat list to a list of slices:
    test = [0,2,3,4,5,6,12,99,100,101,102,13,14,18,19,20,25]
    list_to_slices(test)
    -> [(0, 0), (2, 6), (12, 14), (18, 20), (25, 25), (99, 102)]
    """
    inputlist.sort()  # note: sorts the caller's list in place
    # positions where the gap to the next element is larger than 1
    pointers = numpy.where(numpy.diff(inputlist) > 1)[0]
    # pair each run's first position with its last position
    pointers = zip(numpy.r_[0, pointers+1], numpy.r_[pointers, len(inputlist)-1])
    slices = [(inputlist[i], inputlist[j]) for i, j in pointers]
    return slices
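This function returns (start, end) tuples and repeats the value for single elements, e.g. (0, 0), whereas the question asks for [start] in that case. If the exact format from the question is needed, a small post-processing step along these lines might do; to_question_format is a hypothetical helper name, and the input is the docstring's example output.

```python
def to_question_format(slices):
    """Turn (start, end) tuples into [start] or [start, end] lists."""
    return [[start] if start == end else [start, end] for start, end in slices]

pairs = [(0, 0), (2, 6), (12, 14), (18, 20), (25, 25), (99, 102)]
print(to_question_format(pairs))
# [[0], [2, 6], [12, 14], [18, 20], [25], [99, 102]]
```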