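I have a list consisting of 1s and 0s, for example
[0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
I want to produce another list of the same length in which each entry is the number of consecutive 0s immediately preceding that position, i.e. the output for the example above would be:
[0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
Note that the first entry of the output list is always 0, and the last entry of the input list does not matter.
So far I have tried: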
def zero_consecutive(input_list):
    output = [0]
    cons = 0
    for i in input_list[:-1]:
        if i == 0:
            cons += 1
            output.append(cons)
        else:
            cons = 0
            output.append(cons)
    return output
It works for this example, but there may be a more efficient way that also covers more edge cases.
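To make the edge cases concrete, the loop-based function can be probed with a few short inputs. This is only a sketch: the inputs are illustrative choices, and the values in the comments were worked out by hand from the loop's logic.

```python
def zero_consecutive(input_list):
    # Same logic as the attempt above, with the shared append hoisted out
    output = [0]
    cons = 0
    for i in input_list[:-1]:
        if i == 0:
            cons += 1
        else:
            cons = 0
        output.append(cons)
    return output

# Illustrative edge cases (inputs chosen for this sketch)
print(zero_consecutive([0]))           # [0]
print(zero_consecutive([1, 1, 0, 0]))  # [0, 0, 0, 1]
print(zero_consecutive([0, 0, 0]))     # [0, 1, 2]
print(zero_consecutive([]))            # [0]
```

The empty list is the one input for which the output is not the same length as the input, because the leading 0 is always emitted.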
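Answer 0 (score: 6)
You could write a generator function and convert its result to a list, instead of a function that appends everything to a list. It is generally shorter and in most cases even faster (while doing the same thing)!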
def zero_consecutive(input_list):
    yield 0
    cons = 0
    for i in input_list[:-1]:
        if i == 0:
            cons += 1
        else:
            cons = 0
        yield cons
>>> list(zero_consecutive([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]))
[0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
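The same running-count idea can also be written with itertools.accumulate from the standard library. The following is a minimal sketch, not part of the answer above: the function name is hypothetical, and it assumes Python 3.8 or newer for the `initial` argument.

```python
from itertools import accumulate

def zero_consecutive_accumulate(input_list):
    # Hypothetical name; needs Python 3.8+ because of `initial`.
    def step(cons, item):
        # Extend the current run of zeros, or reset it on a non-zero item
        return cons + 1 if item == 0 else 0
    # `initial=0` supplies the leading 0; the last input item is ignored
    return list(accumulate(input_list[:-1], step, initial=0))

print(zero_consecutive_accumulate([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]))
# [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
```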
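Answer 1 (score: 4)
You said you are really interested in a very fast solution. If performance is critical, you could use a C extension type, for example with Cython.
I'm using IPython, so I simply use the Cython magic:
%load_ext cython
Let Cython compile this iterator class: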
%%cython
cdef class zero_consecutive_cython(object):
    cdef long cons
    cdef object input_list
    cdef int started
    def __init__(self, input_list):
        self.input_list = iter(input_list[:-1])
        self.cons = 0
        self.started = 0
    def __iter__(self):
        return self
    def __next__(self):
        # The first value is always 0
        if self.started == 0:
            self.started = 1
            return 0
        item = next(self.input_list)
        if item == 0:
            self.cons += 1
        else:
            self.cons = 0
        return self.cons
It is essentially the same as the generator function from the other answer, but it is faster:
import numpy as np
def zero_consecutive_numpy(input_list):  # from https://stackoverflow.com/a/45905344/5393381
    # Note: assumes the input begins with a run of zeros
    a = np.array(input_list)
    # Output positions at which each new run first takes effect
    idx = np.flatnonzero(a[1:] != a[:-1])+2
    # Steps of 1 everywhere; the cumulative sum turns them into a running count
    out = np.ones(a.size,dtype=int)
    out[0] = 0
    if len(idx)==0:
        out = np.arange(a.size)
    elif len(idx)==1:
        out[idx[0]] = -a.size
        np.cumsum(out, out=out)
        out[out<0] = 0
    else:
        out[idx[0]] = 2-idx[1]
        # Negative steps sized so the running sum is back at 0 when the next zero run begins
        if len(idx)%2==1:
            out[idx[-1]] = -a.size
            out[idx[2:-1:2]] = 1-idx[3:-1:2] + idx[1:-3:2]
        else:
            out[idx[2::2]] = 1-idx[3::2] + idx[1:-2:2]
        np.cumsum(out, out=out)
        out[out<0] = 0
    return out
def zero_consecutive_python(input_list):  # from https://stackoverflow.com/a/45904440/5393381
    yield 0
    cons = 0
    for i in input_list[:-1]:
        if i == 0:
            cons += 1
        else:
            cons = 0
        yield cons
np.random.seed(0)
for n in [200, 2000, 20000, 100000]:
    print(n)
    a = np.repeat(np.arange(n)%2, np.random.randint(3,8,(n))).tolist()
    %timeit list(zero_consecutive_python(a))
    %timeit list(zero_consecutive_cython(a))
    %timeit zero_consecutive_numpy(a)
This gives me the following result:
200
380 µs ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # python
122 µs ± 1.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each) # cython
488 µs ± 7.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # numpy
2000
3.49 ms ± 26.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) # python
1.07 ms ± 19.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) # cython
3.85 ms ± 288 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) # numpy
20000
42.9 ms ± 3.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) # python
15 ms ± 778 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) # cython
33.9 ms ± 670 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) # numpy
100000
199 ms ± 2.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) # python
77.8 ms ± 507 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) # cython
173 ms ± 4.37 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) # numpy
At least on my computer, this beats the other approaches by a factor of 2-3.
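Outside IPython, where the %timeit magic is not available, a comparable measurement can be sketched with the standard-library timeit module. The function names, the input size and the repeat counts below are illustrative assumptions, and only pure-Python variants are included (no Cython or NumPy), so the numbers will not match the ones above.

```python
import random
import timeit

def zero_consecutive_loop(input_list):
    # List-building variant
    output = [0]
    cons = 0
    for i in input_list[:-1]:
        cons = cons + 1 if i == 0 else 0
        output.append(cons)
    return output

def zero_consecutive_gen(input_list):
    # Generator variant
    yield 0
    cons = 0
    for i in input_list[:-1]:
        cons = cons + 1 if i == 0 else 0
        yield cons

random.seed(0)
data = [random.randint(0, 1) for _ in range(20000)]

# Both variants must agree before their timings are worth comparing
assert zero_consecutive_loop(data) == list(zero_consecutive_gen(data))

candidates = [
    ("loop", lambda: zero_consecutive_loop(data)),
    ("generator", lambda: list(zero_consecutive_gen(data))),
]
for name, func in candidates:
    # Best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.2f} ms per call")
```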
Answer 2 (score: 3)
Here is a vectorized solution -
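def zero_consecutive_vectorized(input_list):
    # Note: assumes the input begins with a run of zeros
    a = np.array(input_list)
    # Output positions at which each new run first takes effect
    idx = np.flatnonzero(a[1:] != a[:-1])+2
    # Steps of 1 everywhere; the cumulative sum turns them into a running count
    out = np.ones(a.size,dtype=int)
    out[0] = 0
    if len(idx)==0:
        out = np.arange(a.size)
    elif len(idx)==1:
        out[idx[0]] = -a.size
        np.cumsum(out, out=out)
        out[out<0] = 0
    else:
        out[idx[0]] = 2-idx[1]
        # Negative steps sized so the running sum is back at 0 when the next zero run begins
        if len(idx)%2==1:
            out[idx[-1]] = -a.size
            out[idx[2:-1:2]] = 1-idx[3:-1:2] + idx[1:-3:2]
        else:
            out[idx[2::2]] = 1-idx[3::2] + idx[1:-2:2]
        np.cumsum(out, out=out)
        out[out<0] = 0
    return out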
Sample runs -
In [493]: a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
In [494]: zero_consecutive_vectorized(a)
Out[494]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
In [495]: a = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
In [496]: zero_consecutive_vectorized(a)
Out[496]: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
In [497]: a = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
In [498]: zero_consecutive_vectorized(a)
Out[498]: [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
Runtime test
Timings against @MSeifert's solution, which seems to be the most competitive of the loop-based solutions -
In [579]: n = 10000
In [580]: a = np.repeat(np.arange(n)%2, np.random.randint(3,8,(n))).tolist()
In [581]: %timeit list(zero_consecutive(a))
...: %timeit zero_consecutive_vectorized(a)
...:
100 loops, best of 3: 2.85 ms per loop
100 loops, best of 3: 1.96 ms per loop
In [582]: n = 60000
In [583]: a = np.repeat(np.arange(n)%2, np.random.randint(3,8,(n))).tolist()
In [584]: %timeit list(zero_consecutive(a))
...: %timeit zero_consecutive_vectorized(a)
...:
100 loops, best of 3: 17.2 ms per loop
100 loops, best of 3: 12 ms per loop
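For comparison, a shorter vectorized formulation is possible with np.maximum.accumulate. This is a different technique from the cumulative-sum approach above, offered as a sketch: the function name is hypothetical, and its speed relative to the timings above has not been measured. For every position it tracks the index of the most recent non-zero entry, takes the distance to it as the current run length, and then shifts the result by one.

```python
import numpy as np

def zero_run_lengths(values):
    # Hypothetical name; a sketch, not the method used in the answer above.
    a = np.asarray(values)
    n = a.size
    out = np.zeros(n, dtype=int)
    if n == 0:
        return out
    positions = np.arange(n)
    # Index of the latest non-zero entry at or before each position (-1 if none)
    last_nonzero = np.maximum.accumulate(np.where(a != 0, positions, -1))
    # Length of the zero run ending at each position (0 on non-zero entries)
    run = positions - last_nonzero
    # Each output entry reports the run that ended just before it
    out[1:] = run[:-1]
    return out

print(zero_run_lengths([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]).tolist())
# [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
```

Unlike the version above, this sketch does not rely on the input starting with a 0.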
Answer 3 (score: 2)
This works:
def zero_consecutive(a):
    y = []
    for i, _ in enumerate(a):
        #prevents a StopIteration error
        if not(1 in a[:i]): y.append(i)
        else:
            index = next(j for j in range(i-1, -1, -1) if a[j])
            y.append(i - index - 1)
    return y
Answer 4 (score: 2)
Here is a way to detect runs of zeros using itertools.groupby:
from itertools import groupby
def zero_consecutive(input_list):
    result = [0]
    for k, values in groupby(input_list[:-1], bool):
        len_values = len(list(values))
        if k:
            result.extend([0] * len_values)
        else:
            result.extend(range(1, len_values + 1))
    return result
>>> zero_consecutive([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0])
[0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
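The grouping uses bool as the key, so that all non-zero (truthy) values are treated equivalently. This means the function also works for lists containing values other than 0 and 1, for example: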
>>> zero_consecutive([0, 0, 0, 0, 1, 2, 'a', 2, 1000, 0, 1, 0])
[0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0]
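To see what groupby actually hands to the loop, the runs can be listed on their own. This is a small illustrative sketch; the variable names are not from the answer.

```python
from itertools import groupby

data = [0, 0, 0, 0, 1, 2, 'a', 2, 1000, 0, 1, 0]

# One (key, run length) pair per run; key is False for zeros, True for anything truthy
runs = [(key, len(list(group))) for key, group in groupby(data[:-1], bool)]
print(runs)
# [(False, 4), (True, 5), (False, 1), (True, 1)]
```

Each False run of length n contributes 1..n to the result, and each True run contributes n zeros.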
Answer 5 (score: 1)
Another solution using numpy and scipy, just for fun
import numpy as np
from scipy.ndimage import label, shift
a = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0])
a_zeros = a == 0
# Give every run of zeros its own label (0 is the background)
labels = label(a_zeros)[0]
for l in np.unique(labels):
    a[labels == l] = a_zeros[labels == l].cumsum()
shift(a, 1, output=a)
>>> a
Out[1]:
array([0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 1, 0])
If you want it as a function:
def zero_consecutive(array):
    a = array.copy()
    a_zeros = a == 0
    labels = label(a_zeros)[0]
    for l in np.unique(labels):
        a[labels == l] = a_zeros[labels == l].cumsum()
    shift(a, 1, output=a)
    return a
Edit: improved version
Better performance.
import numpy as np
from scipy.ndimage import label, labeled_comprehension, shift
def zero_consecutive(array):
    def func(a, idx):
        # Write the running count of one zero run into the result at its positions
        r[idx] = a.astype(bool).cumsum()
        return True
    r = np.zeros_like(array)
    labels, nlabels = label(array == 0)
    labeled_comprehension(labels, labels, np.arange(1, nlabels + 1), func, int, 0, pass_positions=True)
    return shift(r, 1)
Answer 6 (score: -2)
list(map(int,list(''.join(['0' if elem=='' else ''.join(map(str,list(range(len(elem)+1)))) for elem in str([0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]).strip('[').strip(']').replace(', ','').split('1')])[0:-1])))
How about this list comprehension?