Question

I have a list with repeating values, as shown below:

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

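This list is generated from a pattern-matching regular expression (not shown here). The list is guaranteed to contain repeating values (many repeats: hundreds, if not thousands), and it is never randomly ordered, because this is what the regex matches each time.

What I want is to track the list indices at which an entry changes from the previous value. So for the list x above, I want to obtain the change-tracking list [3, 6], indicating that x[3] and x[6] differ from the entries just before them in the list.

I managed to do this, but I would like to know whether there is a cleaner way. Here is my code: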

x = [1, 1, 1, 2, 2, 2, 1, 1, 1]

flag = []
for index, item in enumerate(x):
    if index != 0:
        if x[index] != x[index-1]:
            flag.append(index)

print flag

Output: [3, 6]

Question: Is there a cleaner way to do what I want, in fewer lines of code?

Answer 1

This can be done with a list comprehension and the range function:

>>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
>>> x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
>>> [i for i in range(1,len(x)) if x[i]!=x[i-1] ]
[3, 6]
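
For reuse, the same comprehension can be wrapped in a small function. This is only a sketch: the name change_indices is made up for illustration, and Python 3 is assumed (the question itself uses the Python 2 print statement).

```python
def change_indices(values):
    """Return the indices i at which values[i] differs from values[i - 1]."""
    # range(1, len(values)) is empty for lists of length 0 or 1,
    # so those inputs simply produce an empty result.
    return [i for i in range(1, len(values)) if values[i] != values[i - 1]]


x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
print(change_indices(x))  # [3, 6]
```

Because index 0 has no predecessor, it is never reported, which matches the behaviour of the loop in the question.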

Answer 2

You can use itertools.izip, itertools.tee and a list comprehension for this kind of thing:

from itertools import izip, tee
it1, it2 = tee(x)
next(it2)
print [i for i, (a, b) in enumerate(izip(it1, it2), 1) if a != b]
# [3, 6]
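
The snippet above is Python 2 (izip and the print statement). A rough Python 3 equivalent, assuming the built-in zip (which is lazy in Python 3) stands in for izip, might look like the following. The function name change_indices_tee is hypothetical.

```python
from itertools import tee


def change_indices_tee(iterable):
    """Pair each element with its successor and report where they differ."""
    it1, it2 = tee(iterable)
    # Advance the second iterator by one element; the default value
    # avoids a StopIteration error when the input is empty.
    next(it2, None)
    # Counting from 1 makes i the index of b, the later element of each pair.
    return [i for i, (a, b) in enumerate(zip(it1, it2), 1) if a != b]


x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
print(change_indices_tee(x))  # [3, 6]
```

Since tee works on any iterable, this version also accepts generators, not only lists. On Python 3.10 or later, itertools.pairwise could replace the tee/next pair.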

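Another approach is to use itertools.groupby on enumerate(x). groupby collects consecutive equal items into groups, so all we need is the index of the first item of each group, except for the first group: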

from itertools import groupby
from operator import itemgetter
it = (next(g)[0] for k, g in groupby(enumerate(x), itemgetter(1)))
next(it) # drop the first group
print list(it)
# [3, 6]
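
As a sketch of the same groupby idea under Python 3 (an assumption; the snippet above uses the Python 2 print statement), the first group can be skipped with islice instead of a bare next call, so an empty list does not raise StopIteration. The function name change_indices_groupby is made up for this example.

```python
from itertools import groupby, islice
from operator import itemgetter


def change_indices_groupby(values):
    # Group (index, value) pairs by value; each run of equal
    # consecutive values forms one group.
    groups = groupby(enumerate(values), key=itemgetter(1))
    # The first pair in each group carries the index where that run starts.
    starts = (next(group)[0] for _, group in groups)
    # Skip the first run, which always starts at index 0.
    return list(islice(starts, 1, None))


x = [1, 1, 1, 2, 2, 2, 1, 1, 1]
print(change_indices_groupby(x))  # [3, 6]
```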

If NumPy is an option:

>>> import numpy as np
>>> np.where(np.diff(x) != 0)[0] + 1
array([3, 6])

Answer 3

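Instead of indexing into the list twice at every step, you can use an iterator to check the next element in the list. Both approaches are O(n) overall; note that the slices x[1:] and x[:-1] each make a copy of the list: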

>>> x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> i_x = iter(x[1:])
>>> [i for i, j in enumerate(x[:-1], 1) if j != next(i_x)]
[3, 6]

Answer 4

I am adding the obligatory answer that contains a list comprehension.

flag = [i+1 for i, value in enumerate(x[1:]) if (x[i] != value)]

Answer 5

itertools.izip_longest is exactly what you are looking for:

from itertools import islice, izip_longest

flag = []
# leader runs one element ahead of trailer
leader, trailer = islice(iter(x), 1, None), iter(x)
# Count from 1 so that i is the index of current in x
for i, (current, previous) in enumerate(izip_longest(leader, trailer), 1):
    # Skip comparing the last entry to nothing
    # If None is a valid value use a different sentinel for izip_longest
    if current is None:
        continue
    if current != previous:
        flag.append(i)
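
The comment about None being a valid value can be illustrated with a dedicated sentinel object. The sketch below assumes Python 3, where izip_longest is named zip_longest, and assumes the input is a sequence such as a list, so that the two iterators are independent. The names _MISSING and change_indices_sentinel are hypothetical.

```python
from itertools import islice, zip_longest

# Unique placeholder that cannot collide with real data, including None.
_MISSING = object()


def change_indices_sentinel(values):
    flag = []
    # leader runs one element ahead of trailer.
    leader, trailer = islice(values, 1, None), iter(values)
    pairs = zip_longest(leader, trailer, fillvalue=_MISSING)
    # Counting from 1 makes i the index of current in values.
    for i, (current, previous) in enumerate(pairs, 1):
        # The final pair has no leader element left; skip it.
        if current is _MISSING:
            continue
        if current != previous:
            flag.append(i)
    return flag


print(change_indices_sentinel([None, None, 1, None]))  # [2, 3]
```

With a plain None fill value, the change to None at index 3 in this list would be skipped by mistake, which is why a separate sentinel is used here.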

Tracking value changes in a repetitive list in Python
