Remove duplicates from a list if they appear in a specific order

Time: 2019-08-13 16:50:43

Tags: python python-3.x

I am trying to remove items from a list if they appear in it in a specific order.

I tried the following code:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]

b = ["ijk", "123", "456", "123", "rst", "xyz"]
counter = 0
for i in b[:]:
    print(i)
    counter = counter + 1
    print(counter)
    if i in a and i in a[counter+2]:
        print(a)
        print(">>>>>", a[counter+2])
        b.remove(i)

print(b)

I am looking for the following output:

b = ["ijk", "123", "456", "123"]

["rst", "xyz"] were removed from b because they appear back to back, in the same order, in a.

2 Answers:

Answer 0: (score: 0)

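This solution works if each item of a appears only once in it:

  • We first create a dict with the items of a as keys and their indices as values.
  • Using itertools.groupby, we group the items of b that appear in sequence in a, which corresponds to a constant value of (index in a) - (index in b).
  • We keep only the sequences of length 1.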

Hence the code:
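A minimal sketch of the three steps listed above, assuming the items of a are unique. The names remove_runs and offset_key are hypothetical, and giving every item of b that is missing from a its own group is an assumption, since the steps do not say how such items are handled.

```python
from itertools import groupby


def remove_runs(a, b):
    """Drop from b every run of two or more items that is also consecutive in a."""
    # Step 1: map each item of a to its index (requires unique items in a).
    index_in_a = {item: i for i, item in enumerate(a)}

    def offset_key(pair):
        pos_b, item = pair
        if item in index_in_a:
            # Items that are consecutive in both lists share the same offset.
            return ("in_a", index_in_a[item] - pos_b)
        # Assumption: an item absent from a never joins a run, so it gets a
        # key that is unique to its position in b.
        return ("not_in_a", pos_b)

    result = []
    # Step 2: group neighbouring items of b that share the same offset.
    for _, group in groupby(enumerate(b), key=offset_key):
        group = list(group)
        # Step 3: keep only the groups of length 1.
        if len(group) == 1:
            result.append(group[0][1])
    return result


a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "123", "456", "123", "rst", "xyz"]
print(remove_runs(a, b))  # ['ijk', '123', '456', '123']
```

Because groupby only merges neighbouring elements, two items of b with the same offset that are separated by another item still end up in different groups, so only true back-to-back runs are removed.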

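Answer 1: (score: 0)

Here is one way to satisfy the "sequence" condition, using pairwise lookups.

The idea is to build, up front, a set of all consecutive pairs in your "a", or lookup_list. Then iterate over b in pairs. If a pair is found, set a flag so that both of its elements (the current one and the next one) are skipped. Otherwise, append the first item of the pair, since it is guaranteed not to occur in sequence with the next item.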
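To make the two building blocks concrete, here is a small sketch of what the pair set and the pairwise iteration contain for the lists from the question. The variable names pairs_in_a, pairs_in_b and matches are illustrative only.

```python
from itertools import zip_longest

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "123", "456", "123", "rst", "xyz"]

# Every adjacent pair of a, stored in a set for fast membership tests.
pairs_in_a = set(zip(a, a[1:]))

# Adjacent pairs of b; zip_longest pads the final pair with None so that
# the last element of b is still visited.
pairs_in_b = list(zip_longest(b, b[1:]))

# Pairs of b that also occur back to back in a.
matches = [pair for pair in pairs_in_b if pair in pairs_in_a]
print(matches)  # [('rst', 'xyz')]
```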

Demo:

from itertools import zip_longest

def remove_dupes_in_seq(lookup_list, b):
    ''' lookup_list: list from which you need to check for sequences
    b: list from which you need to get the output that 
    has all elements of b that do not occur in a sequence in lookup_list.
    '''
    pair_set_lookup = set(zip(lookup_list, lookup_list[1:]))
    #make a set of all pairs to check for sequences
    result = []
    skip_next_pair = False #boolean used to indicate that elements need to be skipped
    for pair in zip_longest(b, b[1:]):
        if skip_next_pair:
            #set the boolean according to current pair, then perform a skip
            skip_next_pair = pair in pair_set_lookup
            continue
        if pair in pair_set_lookup:
            #pair found. set flag to skip next element
            skip_next_pair = True
        else:
            #the first item is guaranteed to not occur in a sequence. Append it to output.
            result.append(pair[0])
    return result

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "123", "456","123", "rst", "xyz" ]
out = remove_dupes_in_seq(a, b) #['ijk', '123', '456', '123']
b2 = ["ijk","lmn","456","123","rst","xyz"]
out2 = remove_dupes_in_seq(a, b2) #['456', '123']