How to remove consecutive duplicate values from a list or string

Time: 2015-06-08 08:38:15

Tags: python algorithm

I would like to know how to remove consecutive duplicate values from a list or a string.

If my list is:

mylist = ["N","N","J","N","J","S","S","K","K","K","A","K"]

I should get:

["N","J","N","J","S","K","A","K"]

5 answers:

Answer 0: (score: 4)

You can use a list comprehension:

>>> mylist = ["N","N","J","N","J","S","S","K","K","K","A","K"]
>>> [j for i, j in enumerate(mylist) if j != mylist[i-1] or i == 0]
['N', 'J', 'N', 'J', 'S', 'K', 'A', 'K']
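The `or i == 0` test matters because `mylist[i-1]` wraps around to the last element when `i` is 0. A minimal sketch of the difference, assuming a hypothetical three-element list whose first and last items are equal (the function names are mine, not part of the answer):

```python
def dedupe_consecutive(items):
    # Keep an element if it is the first one or differs from its predecessor.
    # Testing i == 0 first means items[0] is never compared with items[-1].
    return [x for i, x in enumerate(items) if i == 0 or x != items[i - 1]]

def dedupe_without_guard(items):
    # The same comprehension without the guard, shown only for contrast.
    return [x for i, x in enumerate(items) if x != items[i - 1]]

wrapped = ["K", "A", "K"]
print(dedupe_consecutive(wrapped))    # ['K', 'A', 'K']
print(dedupe_without_guard(wrapped))  # ['A', 'K'] (first element lost)
```

The sample list from the question starts with "N" and ends with "K", so it would not expose the problem on its own.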

Answer 1: (score: 3)

You can use groupby from itertools:

In [28]: from itertools import groupby

In [30]: lst
Out[30]: ['N', 'N', 'J', 'N', 'J', 'S', 'S', 'K', 'K', 'K', 'A', 'K']

In [31]: [elem[0] for elem in groupby(lst)]
Out[31]: ['N', 'J', 'N', 'J', 'S', 'K', 'A', 'K']
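The question also asks about strings. A string is an iterable of characters, so the same groupby idea should carry over with a join at the end. A small sketch, where `squeeze` is a name I made up for illustration:

```python
from itertools import groupby

def squeeze(text):
    # groupby yields one (key, group) pair per run of equal characters;
    # keeping only the keys collapses each run to a single character.
    return "".join(key for key, _group in groupby(text))

print(squeeze("NNJNJSSKKKAK"))  # NJNJSKAK
```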

Performance

In [33]: %timeit [j for i, j in enumerate(lst) if j != lst[i-1]]
100000 loops, best of 3: 2.8 µs per loop

In [34]: %timeit [elem[0] for elem in groupby(lst)]
100000 loops, best of 3: 2.55 µs per loop

In [36]: %timeit list(map(lambda x: x[0], filter(lambda x: x[0] != x[1], zip(lst,lst[1:]+['']))))
100000 loops, best of 3: 9.35 µs per loop

Answer 2: (score: 0)

Using map, filter and zip is quite instructive:

>>> list(map(lambda x: x[0], filter(lambda x: x[0] != x[1], zip(mylist,mylist[1:]+['']))))
['N', 'J', 'N', 'J', 'S', 'K', 'A', 'K']
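One caveat: the `''` padding doubles as a real value, so a list that ends in an empty string would lose that last element. A possible variant, assuming a private sentinel object is acceptable (`_SENTINEL` and `dedupe_pairs` are hypothetical names):

```python
_SENTINEL = object()  # compares unequal to every ordinary list element

def dedupe_pairs(seq):
    items = list(seq)
    # Pair each element with its successor; the last one is paired with the sentinel.
    pairs = zip(items, items[1:] + [_SENTINEL])
    # Keep an element only when the next one differs, i.e. the last of each run.
    return [a for a, b in pairs if a != b]

print(dedupe_pairs(["a", "", ""]))  # ['a', '']
```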

Answer 3: (score: 0)

You are looking for the unique_justseen recipe from the itertools documentation:

from itertools import groupby
from operator import itemgetter

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(itemgetter(1), groupby(iterable, key)))

Source: https://docs.python.org/3/library/itertools.html#itertools-recipes
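In Python 3 the recipe returns a lazy map object, so wrap it in list() (or "".join() for strings) to see the result. A short usage sketch applied to the list from the question:

```python
from itertools import groupby
from operator import itemgetter

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # groupby yields (key, group) pairs; itemgetter(1) picks the group iterator
    # and next() takes the first element of each group.
    return map(next, map(itemgetter(1), groupby(iterable, key)))

mylist = ["N", "N", "J", "N", "J", "S", "S", "K", "K", "K", "A", "K"]
print(list(unique_justseen(mylist)))               # ['N', 'J', 'N', 'J', 'S', 'K', 'A', 'K']
print("".join(unique_justseen("AAAABBBCCDAABBB")))  # ABCDAB
```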

Answer 4: (score: 0)

1. The list problem

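You can do this with Python's reduce function (a built-in in Python 2; in Python 3, import it with from functools import reduce), assuming your list has at least one element: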

reduce(lambda lst, el: lst if lst[-1] == el else lst + [el], mylist[1:], [mylist[0]])

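What this does is start from a new list that contains only the first element of the original list. reduce then walks through the remaining elements one by one and calls the supplied function on the accumulated list and the current element. The function simply checks whether the current element equals the last element of the accumulated list. If it does, the current element is ignored and the accumulated list is returned unchanged; otherwise the element is appended and the extended list is returned.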
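The same idea can be written so that it also copes with an empty list, by starting from an empty accumulator and checking it before indexing. This is only a sketch; `dedupe_reduce` and `step` are names I introduced:

```python
from functools import reduce

def dedupe_reduce(items):
    def step(acc, el):
        # Skip el when it repeats the last element already kept.
        if acc and acc[-1] == el:
            return acc
        # acc + [el] builds a new list each time, which is simple but
        # quadratic in the worst case for long inputs.
        return acc + [el]
    return reduce(step, items, [])

mylist = ["N", "N", "J", "N", "J", "S", "S", "K", "K", "K", "A", "K"]
print(dedupe_reduce(mylist))  # ['N', 'J', 'N', 'J', 'S', 'K', 'A', 'K']
print(dedupe_reduce([]))      # []
```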

2. The brackets problem

As for the brackets problem, you can use the built-in filter function together with a classic hand-rolled stack:

print(''.join(filter(escaped, mystr)))  # join is needed in Python 3, where filter returns an iterator

where escaped is defined as follows:

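bracket_stack = []  # open brackets seen so far, innermost last (module-level, so reset it between strings)
def escaped(c):
    # Return True if c should be kept, i.e. it lies outside every bracket pair.
    ignore = False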
    if c in ['(', '[', '<']:
        bracket_stack.append(c)
        ignore = True
    elif bracket_stack and c in [')', ']', '>']:
        ignore = True
        if c == ')' and bracket_stack[-1] == '(':
            bracket_stack.pop()
        if c == ']' and bracket_stack[-1] == '[':
            bracket_stack.pop()
        if c == '>' and bracket_stack[-1] == '<':
            bracket_stack.pop()

    in_brackets = len(bracket_stack)
    return not (ignore or in_brackets)
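Because escaped relies on a module-level stack, leftover state from one string can leak into the next call. A self-contained variant that keeps the stack local might look like the sketch below; it is meant to keep the same characters as the code above, and `strip_bracketed` is a hypothetical name:

```python
def strip_bracketed(text):
    closers = {')': '(', ']': '[', '>': '<'}  # closing bracket -> its opener
    openers = set(closers.values())
    stack = []  # currently open brackets, innermost last
    kept = []
    for c in text:
        if c in openers:
            stack.append(c)            # opening brackets are always dropped
        elif c in closers and stack:
            if stack[-1] == closers[c]:
                stack.pop()            # matching closer ends the innermost group
        elif not stack:
            kept.append(c)             # outside all brackets: keep the character
    return ''.join(kept)

print(strip_bracketed("a(b)c[d]e"))  # ace
print(strip_bracketed("x<y(z)>w"))   # xw
```

As in the original, a closing bracket that appears while nothing is open is kept as an ordinary character.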