Given this list:
a = ['a','b','b','b','c','c','d','e','e']
I want to return a list of lists, where each inner list holds the start and end index of a run of equal values, like this:
[[0,0], [1,3], [4,5], [6,6], [7,8]]
Answer 0 (score: 2)
Using itertools.groupby (doc):
a = ['a','b','b','b','c','c','d','e','e']
from itertools import groupby
last_index = 0
out = []
for v, g in groupby(enumerate(a), lambda k: k[1]):
    l = [*g]
    out.append([last_index, l[-1][0]])
    last_index += len(l)
print(out)
Prints:
[[0, 0], [1, 3], [4, 5], [6, 6], [7, 8]]
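Since each group already carries the original indices from enumerate, the running counter is optional. A minimal sketch of that variant, assuming a hypothetical helper name run_ranges that is not part of the answer above:

```python
from itertools import groupby


def run_ranges(values):
    """Return [start, end] index pairs for each run of equal consecutive values.

    run_ranges is a hypothetical name used only for this illustration.
    """
    ranges = []
    # Group (index, value) pairs by value; each group is one consecutive run.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1]):
        indices = [index for index, _ in group]
        # The first and last index of the run are its start and end.
        ranges.append([indices[0], indices[-1]])
    return ranges


a = ['a', 'b', 'b', 'b', 'c', 'c', 'd', 'e', 'e']
print(run_ranges(a))  # [[0, 0], [1, 3], [4, 5], [6, 6], [7, 8]]
```

This trades the counter for one list of indices per group, which may read more directly at the cost of a little extra memory.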
Answer 1 (score: 1)
If the list is sorted:
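def find_ranges(a):  # wrapper name is illustrative; the original showed only the body
    if len(a) == 0:
        return []
    result = []
    firstSeenIndex, elementInWatch = 0, a[0]
    # start counting at 1 because a[0] is already being watched
    for i, ele in enumerate(a[1:], 1):
        if ele == elementInWatch:
            continue
        else:
            # the previous run ended at i - 1; start watching the new value
            result.append([firstSeenIndex, i - 1])
            firstSeenIndex = i
            elementInWatch = ele
    # close the final run
    result.append([firstSeenIndex, len(a) - 1])
    return result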
Note: there are many better ways to do this; I hope this one is intuitive.
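To illustrate why the sorted condition matters: a loop that compares neighbouring elements reports consecutive runs, so a value that reappears later in an unsorted list produces a second range. A minimal sketch, assuming a hypothetical function name adjacent_runs:

```python
def adjacent_runs(values):
    """Return [start, end] pairs for runs of equal neighbouring values.

    adjacent_runs is a hypothetical name used only for this illustration.
    """
    ranges = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of the list or when the value changes.
        if i == len(values) or values[i] != values[start]:
            ranges.append([start, i - 1])
            start = i
    return ranges


print(adjacent_runs(['a', 'b', 'b']))  # [[0, 0], [1, 2]]
print(adjacent_runs(['b', 'a', 'b']))  # [[0, 0], [1, 1], [2, 2]]
```

In the second call 'b' appears in two separate ranges, which is only what you want if runs, rather than distinct values, are the goal.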
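Answer 2 (score: 1)
By using itertools.groupby together with itertools.accumulate, we can avoid accumulating the indices ourselves.
In addition, this does not attach extra data to every element of the original array, only to each group.
Try it: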
from itertools import groupby, accumulate
a = ['a', 'b', 'b', 'b', 'c', 'c', 'd', 'e', 'e']
lens = [len(list(g)) for _, g in groupby(a)]
result = [[accumulated_length-current_length, accumulated_length-1] for current_length, accumulated_length in zip(lens, accumulate(lens))]
print(result)
Output:
[[0, 0], [1, 3], [4, 5], [6, 6], [7, 8]]
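It may help to look at the intermediate values: accumulate turns the group lengths into running totals, and each total is one past the end index of its group. A small sketch of that step; the names ends and starts are introduced here only for illustration:

```python
from itertools import accumulate, groupby

a = ['a', 'b', 'b', 'b', 'c', 'c', 'd', 'e', 'e']

# Length of each run of equal consecutive values.
lens = [len(list(g)) for _, g in groupby(a)]
print(lens)  # [1, 3, 2, 1, 2]

# Running totals: each entry is the exclusive end index of its run.
ends = list(accumulate(lens))
print(ends)  # [1, 4, 6, 7, 9]

# Subtracting the run length from the running total gives the start index.
starts = [end - length for length, end in zip(lens, ends)]
print(starts)  # [0, 1, 4, 6, 7]
```

The final result pairs each start with end - 1, which gives the inclusive ranges shown above.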
Answer 3 (score: 1)
def start_stop_indice(a):
    # Assumes equal values are contiguous (e.g. the list is sorted),
    # because list.count() counts occurrences across the whole list.
    result = []  # init empty list
    start_inx = 0  # init starting index to 0
    # loop while the starting index is still inside the list
    while start_inx < len(a):
        # count is the number of times the current value is in the list
        count = a.count(a[start_inx])
        # end index is the starting index + number of occurrences - 1
        end_inx = start_inx + count - 1
        # append a list of starting and ending indexes to the results list
        result.append([start_inx, end_inx])
        # add the count to the starting index to get the next value
        start_inx += count
    return result


if __name__ == '__main__':
    a = ['a','b','b','b','c','c','d','e','e']
    print(start_stop_indice(a))