Question

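I'm trying to accumulate a list of the indices of items that appear more than once in a given list. I'm not sure how to do this, since my code only compares pattern[1] and pattern[2] before it terminates.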

def test(pattern):
    """(list) -> list of int

    >>> test(['A', 'B', 'A', 'C', 'A'])
    [0, 2, 4]
    >>> test(['A', 'B'])
    []
    """
    indices = []
    new_list = []

    for i in range(len(pattern) - 1):
        if pattern[i][-1] == pattern[i + 1]:
            indices.append(i)
            new_list = phoneme_list[max(indices):]

    return new_list

Answer 1

>>> lst = ['A', 'B', 'A', 'C', 'A']
>>> [i for i in range(len(lst)) if lst.count(lst[i]) > 1]
[0, 2, 4]

That said, accumulating a list of indices may be a sign that your algorithm could be improved.

Answer 2

I would do it like this:

import collections

d = collections.defaultdict(list)

for idx,el in enumerate(lst):
    d[el].append(idx)
    # you can also do this with d as a normal dict, and instead do
    # d.setdefault(el, []).append(idx)

# this builds {'A':[0,1,3], 'B':[2,4], 'C':[5]} from
# ['A','A','B','A','B','C']

result = [idx for idxs in d.values() for idx in idxs if len(idxs) > 1]
# this builds [0,1,3,2,4] from
# {'A':[0,1,3], 'B':[2,4], 'C':[5]}

It also avoids calling list.count n times, so it should run faster on larger datasets.
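As a rough sketch, the grouping approach above could be wrapped in a function. The name duplicate_indices is hypothetical, and the final sorted() call is an added assumption so that the indices come back in list order instead of grouped by value. It assumes the list items are hashable.

```python
import collections


def duplicate_indices(lst):
    """Return the indices of every item that occurs more than once in lst."""
    # Map each distinct item to the list of positions where it appears.
    positions = collections.defaultdict(list)
    for idx, el in enumerate(lst):
        positions[el].append(idx)
    # Keep only the groups with more than one index, then restore list order.
    return sorted(
        idx for idxs in positions.values() if len(idxs) > 1 for idx in idxs
    )


print(duplicate_indices(['A', 'B', 'A', 'C', 'A']))       # [0, 2, 4]
print(duplicate_indices(['A', 'A', 'B', 'A', 'B', 'C']))  # [0, 1, 2, 3, 4]
print(duplicate_indices(['A', 'B']))                      # []
```

This makes one pass to build the groups plus a sort of the matching indices, instead of one list.count call per element.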

Alternatively, you can use collections.Counter to find every value that occurs more than once, then pull out all of their indices in one pass.

import collections

c = {el for el, val in collections.Counter(lst).items() if val > 1}
# gives {'A','B'}
result = [idx for idx, el in enumerate(lst) if el in c]
# gives [0,1,2,3,4]
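For reference, here is a minimal sketch of the Counter idea written as a function that matches the docstring examples in the question. The name repeated_indices is hypothetical, and this variant looks the count up directly instead of building an intermediate set. Items are assumed to be hashable.

```python
import collections


def repeated_indices(pattern):
    """Return indices of items that appear more than once in pattern.

    >>> repeated_indices(['A', 'B', 'A', 'C', 'A'])
    [0, 2, 4]
    >>> repeated_indices(['A', 'B'])
    []
    """
    # Count every item once up front, then filter indices in list order.
    counts = collections.Counter(pattern)
    return [idx for idx, el in enumerate(pattern) if counts[el] > 1]


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Because the indices are produced by a single enumerate pass, they come out already in ascending order.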

Finding the indices of items that occur more than once in a list
