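Finding the indices of items that occur more than once in a list

Time: 2015-04-01 23:26:35

Tags: python python-3.x

Given a list, I am trying to build a list of the indices of every item that occurs more than once. I'm not sure how to do this, because my code only compares pattern[1] with pattern[2] before it terminates.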

def test(pattern):
    """(list) -> list of int

    >>> test(['A', 'B', 'A', 'C', 'A'])
    [0, 2, 4]
    >>> test(['A', 'B'])
    []
    """
    indices = []
    new_list = []

    for i in range(len(pattern) - 1):
        if pattern[i][-1] == pattern[i + 1]:
            indices.append(i)
            new_list = phoneme_list[max(indices):]

    return new_list

2 Answers:

Answer 0 (score: 3)

>>> lst = ['A', 'B', 'A', 'C', 'A']
>>> [i for i in range(len(lst)) if lst.count(lst[i]) > 1]
[0, 2, 4]

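That said, needing to accumulate a list of indices may be a sign that your algorithm could be improved.

Answer 1 (score: 2)

I would do it like this: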

import collections

lst = ['A', 'A', 'B', 'A', 'B', 'C']  # example input used in the comments below
d = collections.defaultdict(list)

for idx, el in enumerate(lst):
    d[el].append(idx)
    # you can also do this with d as a normal dict, and instead do
    # d.setdefault(el, []).append(idx)

# this builds {'A': [0, 1, 3], 'B': [2, 4], 'C': [5]} from
# ['A', 'A', 'B', 'A', 'B', 'C']

result = [idx for idxs in d.values() if len(idxs) > 1 for idx in idxs]
# this builds [0, 1, 3, 2, 4] from
# {'A': [0, 1, 3], 'B': [2, 4], 'C': [5]}

It also avoids calling list.count n times, so it should run faster on larger datasets.
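To check the speed claim yourself, here is a rough sketch that times both approaches with timeit. The function names and the test data are made up for illustration, and the actual numbers will depend on your machine and input, so no timings are quoted here.

```python
import collections
import timeit


def indices_with_count(lst):
    # Quadratic: list.count scans the whole list once per element.
    return [i for i in range(len(lst)) if lst.count(lst[i]) > 1]


def indices_with_groups(lst):
    # Linear: one pass to group indices by value, one pass to collect them.
    groups = collections.defaultdict(list)
    for idx, el in enumerate(lst):
        groups[el].append(idx)
    return [idx for idxs in groups.values() if len(idxs) > 1 for idx in idxs]


# Made-up data: 2000 items, each value repeated four times.
data = [i % 500 for i in range(2000)]

# The grouped version returns indices grouped by value rather than in
# positional order, so sort before comparing.
assert sorted(indices_with_groups(data)) == indices_with_count(data)

for fn in (indices_with_count, indices_with_groups):
    seconds = timeit.timeit(lambda: fn(data), number=5)
    print(f"{fn.__name__}: {seconds:.4f}s")
```

Note the ordering difference: if you need the indices in list order from the grouped version, wrap the result in sorted().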

Alternatively, you can use collections.Counter to find every value that occurs more than once, then pull out all of their indices in a single pass.

import collections

c = {el for el, val in collections.Counter(lst).items() if val > 1}
# gives {'A', 'B'}
result = [idx for idx, el in enumerate(lst) if el in c]
# gives [0, 1, 2, 3, 4]
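For completeness, the Counter idea can be wrapped in a function with the same contract as the docstring in the question. This is only a sketch: the name repeated_indices is my own choice, not something from the question, and it looks counts up directly instead of building the intermediate set.

```python
import collections


def repeated_indices(pattern):
    """(list) -> list of int

    Return the indices of every item that occurs more than once in pattern.

    >>> repeated_indices(['A', 'B', 'A', 'C', 'A'])
    [0, 2, 4]
    >>> repeated_indices(['A', 'B'])
    []
    """
    # Count every value once, then keep the positions whose value repeats.
    counts = collections.Counter(pattern)
    return [idx for idx, el in enumerate(pattern) if counts[el] > 1]
```

Because it walks the list with enumerate, the indices come back in list order, which matches the expected output in the question.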