I have a boolean (NumPy) array, and I want to count how many consecutive True values occur between the Falses.
For example, for the sample list:
b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
the result should be
ml = [3,3,1]
My initial attempt was this snippet:
i = 0
ml = []
for el in b_List:
    if (b_List):
        i += 1
    ml.append(i)
    i = 0
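But it appends an element to ml for every F in b_List.
Edit
Thank you all for your answers. Sadly, I can't accept all of them as correct. I accepted Akavall's answer because he addressed my initial attempt (I now know what I did wrong) and also compared Mark's and Ashwini's posts.
Please don't treat the accepted solution as the definitive answer, since the other suggestions introduce alternative methods that work just as well.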
Answer 0: (score: 5)
itertools.groupby provides an easy way to do this:
>>> import itertools
>>> T, F = True, False
>>> b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
>>> [len(list(group)) for value, group in itertools.groupby(b_List) if value]
[3, 3, 1]
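To see why the one-liner above works, it can help to look at what groupby yields before the filter is applied. The following is only an illustrative sketch; the names runs and true_runs are introduced here and are not part of the original answer.

```python
from itertools import groupby

T, F = True, False
b_List = [T, T, T, F, F, F, F, T, T, T, F, F, T, F]

# groupby yields one (value, iterator) pair per run of equal consecutive items.
runs = [(value, len(list(group))) for value, group in groupby(b_List)]
print(runs)
# [(True, 3), (False, 4), (True, 3), (False, 2), (True, 1), (False, 1)]

# Keeping only the runs whose value is True gives the requested lengths.
true_runs = [length for value, length in runs if value]
print(true_runs)  # [3, 3, 1]
```

Because groupby only merges adjacent equal items, no sorting is needed (and sorting would destroy the runs).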
Answer 1: (score: 4)
Using NumPy:
>>> import numpy as np
>>> a = np.array([ True, True, True, False, False, False, False, True, True, True, False, False, True, False], dtype=bool)
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([3, 3, 1])
>>> a = np.array([True, False, False, True, True, False, False, True, False])
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([1, 2, 1])
I can't say this is the best NumPy solution, but it is still faster than itertools.groupby:
>>> from itertools import groupby
>>> lis = [ True, True, True, False, False, False, False, True, True, True, False, False, True, False]*1000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
100 loops, best of 3: 9.58 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
1000 loops, best of 3: 1.4 ms per loop
>>> lis = [ True, True, True, False, False, False, False, True, True, True, False, False, True, False]*10000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
1 loops, best of 3: 95.5 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
100 loops, best of 3: 14.9 ms per loop
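As @justhalf and @Mark Dickinson pointed out in the comments, the code above does not work in some cases (for example, when the array starts with False), so you need to pad both ends with False first: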
In [28]: a
Out[28]:
array([ True, True, True, False, False, False, False, True, True,
True, False, False, True, False], dtype=bool)
In [29]: np.diff(np.where(np.diff(np.hstack([False, a, False])))[0])[::2]
Out[29]: array([3, 3, 1])
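The padded approach can be wrapped in a small function to make the edge cases easier to check. This is a minimal sketch, not the answerer's code: the function name true_run_lengths is hypothetical, and it locates the run boundaries by comparing neighbouring elements rather than calling np.diff on the boolean array.

```python
import numpy as np

def true_run_lengths(a):
    """Return the lengths of the runs of consecutive True values in a."""
    a = np.asarray(a, dtype=bool)
    # Pad with False on both ends so every True run has a start and an end edge.
    padded = np.concatenate(([False], a, [False]))
    # Indices where the value changes: even positions are run starts,
    # odd positions are run ends.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges[1::2] - edges[::2]

T, F = True, False
print(true_run_lengths([T, T, T, F, F, F, F, T, T, T, F, F, T, F]))  # [3 3 1]
print(true_run_lengths([F, T, T, F]))  # [2]
print(true_run_lengths([T, T]))        # [2]
```

The second call is a case the unpadded expression gets wrong: for [F, T, T, F] it returns [1] instead of [2], because it assumes the array starts with a True run.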
Answer 2: (score: 2)
There are a few problems with your original attempt:
i = 0
ml = []
for el in b_List:
    if (b_List):  # b_List is a list and will evaluate to True
                  # unless you have an empty list, you want if (el)
        i += 1
    ml.append(i)  # even if the above line was correct you still get here
                  # on every iteration, and you don't want that
    i = 0
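You probably want something like this:
def count_Trues(b_list):
    i = 0
    ml = []
    prev = False
    for el in b_list:
        if el:
            i += 1
            prev = el
        else:
            # a run of Trues has just ended: record its length
            if prev is not el:
                ml.append(i)
                i = 0
                prev = el
    # record a run of Trues that extends to the end of the list
    if el:
        ml.append(i)
    return ml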
Result:
>>> T, F = True, False
>>> b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
>>> count_Trues(b_List)
[3, 3, 1]
>>> b_List.extend([T,T])
>>> count_Trues(b_List)
[3, 3, 1, 2]
>>> b_List.extend([F])
>>> count_Trues(b_List)
[3, 3, 1, 2]
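One edge case to be aware of: with an empty list, the loop in count_Trues never assigns el, so the final `if el:` check raises an error. The variant below is a hedged sketch that avoids this by testing the running counter instead; the name count_true_runs is hypothetical and not from the original answer.

```python
def count_true_runs(values):
    """Return the lengths of runs of consecutive truthy items in values."""
    runs = []
    current = 0  # length of the run of Trues currently being counted
    for el in values:
        if el:
            current += 1
        elif current:
            # A False ends a non-empty run: record it and reset the counter.
            runs.append(current)
            current = 0
    if current:
        # The input ended while still inside a run of Trues.
        runs.append(current)
    return runs

T, F = True, False
print(count_true_runs([T, T, T, F, F, F, F, T, T, T, F, F, T, F]))  # [3, 3, 1]
print(count_true_runs([]))  # []
```

Since the counter is nonzero exactly when a run is in progress, the separate prev variable is not needed.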
This solution also runs very fast:
In [5]: T, F = True, False
In [6]: b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
In [7]: new_b_List = b_List * 100
In [8]: import numpy as np
# Ashwini Chaudhary's Solution
In [9]: %timeit np.diff(np.insert(np.where(np.diff(new_b_List)==1)[0]+1, 0, 0))[::2]
1000 loops, best of 3: 299 us per loop
In [11]: %timeit count_Trues(new_b_List)
1000 loops, best of 3: 130 us per loop
In [12]: new_b_List = b_List * 1000
# Ashwini Chaudhary's Solution
In [13]: %timeit np.diff(np.insert(np.where(np.diff(new_b_List)==1)[0]+1, 0, 0))[::2]
100 loops, best of 3: 2.25 ms per loop
In [14]: %timeit count_Trues(new_b_List)
100 loops, best of 3: 1.33 ms per loop
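Note that the NumPy timings in this answer pass the Python list new_b_List straight to np.diff, so they include the cost of converting the list to an array on every call. To reproduce a comparison outside IPython (where %timeit is unavailable), something like the following sketch with the standard timeit module could be used. The helper names with_groupby and with_numpy are hypothetical, and the numbers will depend on the machine and the NumPy version, so no expected timings are shown.

```python
import timeit
from itertools import groupby

import numpy as np

T, F = True, False
b_List = [T, T, T, F, F, F, F, T, T, T, F, F, T, F] * 1000
a = np.array(b_List)  # convert once, outside the timed code

def with_groupby(seq):
    # Count each True run without building an intermediate list.
    return [sum(1 for _ in group) for value, group in groupby(seq) if value]

def with_numpy(arr):
    # Padded variant from the NumPy answer above.
    padded = np.hstack([False, arr, False])
    return np.diff(np.where(np.diff(padded))[0])[::2]

# Sanity check: both approaches must agree before timing them.
assert with_groupby(b_List) == with_numpy(a).tolist()

candidates = [
    ("groupby", lambda: with_groupby(b_List)),
    ("numpy", lambda: with_numpy(a)),
]
for name, func in candidates:
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    print(f"{name}: {best * 1e3:.3f} ms per call")
```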