Non-contiguous slices of a Python list

Time: 2011-04-27 14:41:15

Tags: python performance list slice

I'm looking for an efficient way to achieve the following, which I think of as a slice-like operation:

>>> mylist = range(100)
>>> magicslicer(mylist, 10, 20)
[0,1,2,3,4,5,6,7,8,9,30,31,32,33,34,35,36,37,38,39,60,61,62,63......,97,98,99]

The idea is: the slice takes 10 elements, then skips 20 elements, then takes the next 10, then skips 20, and so on.

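I think I should avoid loops if possible, because the reason for using a slice is (I guess) to do the "extraction" efficiently in a single operation.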

Thanks for reading.

9 answers:

Answer 0 (score: 20)

itertools.compress (new in 2.7 / 3.1) supports use cases like this nicely, especially when combined with itertools.cycle:

>>> from itertools import cycle, compress
>>> seq = range(100)
>>> criteria = cycle([True]*10 + [False]*20) # Use whatever pattern you like
>>> list(compress(seq, criteria))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
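
If you want this as a reusable function, a minimal sketch could look like the following. The name magicslicer and its parameters are borrowed from the question, not from itertools. The selector pattern is rebuilt on every call, so each call starts at the beginning of the pattern.

```python
from itertools import compress, cycle


def magicslicer(seq, take, skip):
    """Keep `take` items, drop `skip` items, and repeat until seq runs out."""
    # A fresh selector cycle per call keeps the pattern aligned with seq[0].
    selectors = cycle([True] * take + [False] * skip)
    return list(compress(seq, selectors))


print(magicslicer(range(100), 10, 20)[:12])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```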

Python 2.7 timings (relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 4.96 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.76 usec per loop

Python 3.2 timings (also relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 7.41 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.78 usec per loop

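As can be seen, it makes little difference relative to the inline list comprehension in 2.7, but it helps significantly in 3.2 by avoiding the overhead of the implicit nested scope.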

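If the aim is to iterate over the resulting sequence rather than turn it into a fully realized list, a similar difference can be seen in 2.7 as well: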

$ ./python -m timeit -s "a = range(100)" "for x in (x for start in range(0, len(a), 30) for x in a[start:start+10]): pass"
100000 loops, best of 3: 6.82 usec per loop
$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "for x in compress(a, criteria): pass"
100000 loops, best of 3: 3.61 usec per loop
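
To repeat these measurements on your own interpreter, a rough sketch with the timeit module follows. One caveat, as far as I can tell: in the command lines above the cycle is created once in the -s setup and shared by every loop, and since 100 is not a multiple of 30, later loops start at a different point in the pattern. The sketch instead creates the cycle inside the timed function (the names with_slices and with_compress are mine), which adds a little overhead, so the numbers are not directly comparable with those above.

```python
import timeit
from itertools import compress, cycle

a = list(range(100))


def with_slices():
    # Sven's explicit list comprehension.
    return [x for start in range(0, len(a), 30) for x in a[start:start + 10]]


def with_compress():
    # The cycle is built inside the timed call so every run starts in phase.
    return list(compress(a, cycle([True] * 10 + [False] * 20)))


# Both approaches must select the same elements before timing means anything.
assert with_slices() == with_compress()

loops = 10000
for func in (with_slices, with_compress):
    seconds = timeit.timeit(func, number=loops)
    print("%s: %.2f usec per loop" % (func.__name__, seconds / loops * 1e6))
```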

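For especially long patterns, the list in the pattern expression can be replaced with an expression like chain(repeat(True, 10), repeat(False, 20)), so that it never has to be fully created in memory.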
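
A short sketch of that variant, using the same seq as above. Note that the itertools documentation describes cycle as saving a copy of each element it reads, so one period of the pattern is still stored inside cycle; what this avoids is building the separate list first.

```python
from itertools import chain, compress, cycle, repeat

seq = range(100)
# Describe one period of the pattern lazily instead of as a list.
pattern = chain(repeat(True, 10), repeat(False, 20))
result = list(compress(seq, cycle(pattern)))

print(result[:12])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```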

Answer 1 (score: 11)

Maybe the best way is the straightforward approach:

def magicslicer(seq, take, skip):
    return [x for start in range(0, len(seq), take + skip)
              for x in seq[start:start + take]]

I don't think you can avoid the loops.

Edit: Since this is tagged "performance", here is a comparison with the modulo solution for a = range(100):

In [2]: %timeit [x for start in range(0, len(a), 30)
                   for x in a[start:start + 10]]
100000 loops, best of 3: 4.89 us per loop

In [3]: %timeit [e for i, e in enumerate(a) if i % 30 < 10]
100000 loops, best of 3: 14.8 us per loop

Answer 2 (score: 4)

Unfortunately, I don't think this can be done with slices. I solved the problem using a list comprehension:

>>> a = range(100)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
    ...
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
>>> [e for i, e in enumerate(a) if i % 30 < 10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

Answer 3 (score: 1)

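I don't know whether you are working only with numbers, but if you are, there is a faster way. However, the following only works if your list consists of sublists of equal length.

For comparison: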


Answer 4 (score: 0)

I would use a loop:

#!/usr/bin/env python


def magicslicer(l, stepsize, stepgap):
    output = []
    i = 0
    while i < len(l):
        # Copy the next run of `stepsize` items, then jump over the gap.
        output += l[i:i+stepsize]
        i += stepsize + stepgap
    return output


mylist = range(100)
print(magicslicer(mylist, 10, 20))

Answer 5 (score: 0)

>>> [mylist[start:start+10] for start in mylist[::30]]
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]]

But I end up with a list of lists :(
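
One way to flatten the nested result is itertools.chain.from_iterable. This sketch also iterates over start indices instead of over the values in mylist[::30]; using the values as indices only works here because every element of range(100) happens to equal its own position.

```python
from itertools import chain

mylist = list(range(100))
# Take a 10-item slice at every 30th index, then join the slices end to end.
chunks = (mylist[start:start + 10] for start in range(0, len(mylist), 30))
flat = list(chain.from_iterable(chunks))

print(flat[:12])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```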

Answer 6 (score: 0)

[x for x in range(100) if x%30 < 10] is another way to do it. However, this may become slow as the list size grows.

A function along the same lines:

def magic_slice(n, no_elems, step):
    s = no_elems + step
    return [x for x in range(n) if x%s < no_elems]

Answer 7 (score: 0)

mylist = range(100)

otherlist = ['21','31','689','777','479','51','71','yut','poi','ger',
             '11','61','789','zozozozo','8888','1']



def magic_slicer(iterable, keep, throw):
    it = iter(iterable)
    # One pass per keep+throw block; stop cleanly once the iterator runs dry.
    for n in range(len(iterable) // (keep + throw) + 1):
        try:
            for i in range(keep):
                yield next(it)
            for i in range(throw):
                next(it)
        except StopIteration:
            return

print(list(magic_slicer(mylist, 10, 20)))
print()
print(list(magic_slicer(otherlist, 2, 3)))


print('__________________')


def magic_slicer2(iterable, keep, throw):
    # Keep an item when its index falls in the first `keep` positions of a block.
    return (x for i, x in enumerate(iterable) if i % (keep + throw) < keep)

print(list(magic_slicer2(mylist, 10, 20)))
print()
print(list(magic_slicer2(otherlist, 2, 3)))

Result:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']
__________________
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']
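
The first function above calls len(iterable), so it needs a sized input. If the input may be a plain iterator or generator, a possible alternative is to drive the take/skip pattern with itertools.islice. This is only a sketch, and the name magic_slicer3 is made up for illustration.

```python
from itertools import islice


def magic_slicer3(iterable, keep, throw):
    it = iter(iterable)
    while True:
        kept = list(islice(it, keep))
        if not kept:
            # Nothing left to keep (or keep == 0): the input is exhausted.
            return
        for x in kept:
            yield x
        # Advance past the next `throw` items without storing them.
        for _ in islice(it, throw):
            pass


letters = (c for c in "abcdefghij")  # a generator, so len() is unavailable
print(list(magic_slicer3(letters, 2, 3)))
# ['a', 'b', 'f', 'g']
```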

Answer 8 (score: 0)

In [1]: mylist = [0,1,2,3,4,5,6,7,8,9]                               

In [2]: mylist[1:3]                                                  
Out[2]: [1, 2]

In [3]: mylist[5:7]                                                  
Out[3]: [5, 6]

In [4]: mylist[1:3] + mylist[5:7]                                    
Out[4]: [1, 2, 5, 6]
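
Concatenating slices by hand generalizes to any number of (start, stop) pairs. The helper below is a hypothetical sketch (concat_slices is not a built-in); for the pattern in the question, the pairs can be generated with range.

```python
from itertools import chain


def concat_slices(seq, ranges):
    """Join seq[start:stop] for every (start, stop) pair, in order."""
    return list(chain.from_iterable(seq[start:stop] for start, stop in ranges))


mylist = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(concat_slices(mylist, [(1, 3), (5, 7)]))
# [1, 2, 5, 6]

# The take-10/skip-20 pattern from the question, expressed as slice bounds.
biglist = list(range(100))
bounds = [(start, start + 10) for start in range(0, len(biglist), 30)]
print(concat_slices(biglist, bounds)[:12])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```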