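In this other SO post, a Python user asked how to group consecutive numbers so that any sequence could be represented by just its start/end, with any stragglers shown as single items. The accepted answer works well for consecutive sequences.
I need to adapt a similar solution, but for sequences of numbers with potentially (not always) varying increments. Ideally, the representation would also include the increment (so it is clear whether the step is every 3, 4, 5, n).
Referencing the original question, the user asked for the following input/output: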
[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20] # input
[(2,5), (12,17), 20]
What I want is the following (note: I wrote a tuple as the output for clarity, but xrange, using its step argument, would be preferred):
[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20] # input
[(2,5,1), (12,17,1), 20] # note, the last element in the tuple would be the step value
It should also be able to handle the following input:
[2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20] # input
[(2,8,2), (12,17,1), 20] # note, the last element in the tuple would be the increment
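I know that xrange() supports a step, so it may even be possible to use a variant of the other user's answer. I tried making some edits based on what they wrote in the explanation, but I couldn't get the result I was looking for.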
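To make the target representation concrete: a (start, end, step) tuple with an inclusive end corresponds to a stepped range whose stop is pushed one step past the end. The following is a minimal sketch, written for Python 3 (where range replaces xrange) and assuming integer values with a positive step; the helper name to_range is hypothetical.

```python
def to_range(start, end, step):
    """Build a range that includes `end`, given an inclusive (start, end, step) triple."""
    # range() excludes its stop value, so move the stop one step past `end`.
    return range(start, end + step, step)


print(list(to_range(2, 8, 2)))    # [2, 4, 6, 8]
print(list(to_range(12, 17, 1)))  # [12, 13, 14, 15, 16, 17]
```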
For anyone who doesn't want to click the original link, the code originally posted by Nadia Alramli is:
from itertools import groupby
from operator import itemgetter

ranges = []
for key, group in groupby(enumerate(data), lambda (index, item): index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
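The quoted snippet is Python 2 only: tuple-unpacking lambda parameters, xrange, and a list-returning map are all gone in Python 3. As a rough sketch of the same idea for Python 3 readers, the version below keeps the index - item grouping key but returns inclusive (first, last) tuples instead of xrange objects. The function name group_consecutive is hypothetical and not part of the original answer.

```python
from itertools import groupby


def group_consecutive(data):
    """Collapse runs of consecutive integers into (first, last) tuples."""
    ranges = []
    # Within a run of consecutive integers, index - item stays constant.
    for _, group in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        items = [item for _, item in group]
        if len(items) > 1:
            ranges.append((items[0], items[-1]))
        else:
            ranges.append(items[0])
    return ranges


print(group_consecutive([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 5), (12, 17), 20]
```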
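Answer 0 (score: 4)
The itertools pairwise recipe is one way to solve the problem. Combined with itertools.groupby, it creates groups of pairs whose mathematical difference is equal. The first and last items of each group are then selected for multi-item groups, or the last item for singleton groups: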
from itertools import groupby, tee, izip


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)


def grouper(lst):
    result = []
    for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0]):
        g = list(g)
        if len(g) > 1:
            try:
                if g[0][0] == result[-1]:
                    del result[-1]
                elif g[0][0] == result[-1][1]:
                    g = g[1:]  # patch for duplicate start and/or end
            except (IndexError, TypeError):
                pass
            result.append((g[0][0], g[-1][-1], k))
        else:
            result.append(g[0][-1]) if result else result.append(g[0])
    return result
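To see what grouper works with, it can help to print the intermediate groups that groupby produces from the pairs. The sketch below is a Python 3 adaptation (the built-in zip stands in for izip) and only shows the grouping step, not the patching logic:

```python
from itertools import groupby, tee


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # Python 3: the built-in zip is lazy, so izip is not needed


lst = [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
# Group neighbouring pairs by their difference (the step between the two values).
groups = [(k, list(g)) for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0])]
for step, pairs in groups:
    print(step, pairs)
# 2 [(2, 4), (4, 6), (6, 8)]
# 4 [(8, 12)]
# 1 [(12, 13), (13, 14), (14, 15), (15, 16), (16, 17)]
# 3 [(17, 20)]
```

The singleton groups (steps 4 and 3 here) are the jumps between runs, which is why grouper keeps only the last item of such a pair and then drops it again if the next multi-pair group starts with the same value.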
Trial: input -> grouper(lst) -> output
Input: [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 5, 1), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 12.4, 12.9, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), 12, 12.4, 12.9, (13, 17, 1), 20] # 12 does not appear in the second group
Update (patch for duplicate start and/or end values):
s1 = [i + 10 for i in xrange(0, 11, 2)]; s2 = [30]; s3 = [i + 40 for i in xrange(45)]
Input: s1+s2+s3
Output: [(10, 20, 2), (30, 40, 10), (41, 84, 1)]
# to make 30 appear as an entry instead of a group change main if condition to len(g) > 2
Input: s1+s2+s3
Output: [(10, 20, 2), 30, (41, 84, 1)]
Input: [2, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 12, 2), (13, 17, 1), 20]
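One way to sanity-check outputs like these is to expand them back into a flat list and compare with the input. The following is a small sketch under the assumption that tuples hold integer (start, end, step) values with an inclusive end; the helper name expand is hypothetical and written for Python 3.

```python
def expand(grouped):
    """Turn a mix of (start, end, step) tuples and single values back into a flat list."""
    flat = []
    for entry in grouped:
        if isinstance(entry, tuple):
            start, end, step = entry
            flat.extend(range(start, end + step, step))  # end is inclusive
        else:
            flat.append(entry)
    return flat


print(expand([(2, 8, 2), (12, 17, 1), 20]))
# [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
```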
Answer 1 (score: 2)
Here is a quickly written (and extremely ugly) answer:
def test(inArr):
    arr=inArr[:] #copy, unnecessary if we use index in a smart way
    result = []
    while len(arr)>1: #as long as there can be an arithmetic progression
        x=[arr[0],arr[1]] #take first two
        arr=arr[2:] #remove from array
        step=x[1]-x[0]
        while len(arr)>0 and x[1]+step==arr[0]: #check if the next value in array is part of progression too
            x[1]+=step #add it
            arr=arr[1:]
        result.append((x[0],x[1],step)) #append progression to result
    if len(arr)==1:
        result.append(arr[0])
    return result

print test([2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20])
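This returns [(2, 8, 2), (12, 17, 1), 20].
It is slow, because it copies the list and removes elements from it.
It only finds complete progressions, and only in sorted arrays.
In short, it's bad, but it should work ;)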
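The copying can be avoided by walking the list with an index, as the comment in the code hints. The sketch below is one possible Python 3 rewrite that keeps the same greedy behavior (the first two remaining values always start a progression); the function name progressions is hypothetical.

```python
def progressions(values):
    """Greedy left-to-right split into (first, last, step) runs, without copying the list."""
    result = []
    i = 0
    n = len(values)
    while n - i > 1:  # at least two items left, so a progression can start
        first, last = values[i], values[i + 1]
        step = last - first
        i += 2
        # Extend the run while the next value continues the progression.
        while i < n and values[i] == last + step:
            last = values[i]
            i += 1
        result.append((first, last, step))
    if n - i == 1:  # one trailing value that belongs to no run
        result.append(values[i])
    return result


print(progressions([2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 8, 2), (12, 17, 1), 20]
```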
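There are other (cooler, more Pythonic) ways to do this; for example, you could convert the list to a set, keep removing two elements, calculate their arithmetic progression, and intersect it with the set.
You could also reuse the answer you provided to check for a specific step size, e.g.: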
ranges = []
step_size=2
for key, group in groupby(enumerate(data), lambda (index, item): step_size*index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
This finds every group with a step size of 2, but only those.
Answer 2 (score: 2)
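You can create an iterator to help with the grouping and try to pull the next element from the following group, which will be the end of the previous group: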
from itertools import groupby


def ranges(lst):
    it = iter(lst)
    next(it)  # move to second element for comparison
    grps = groupby(lst, key=lambda x: (x - next(it, -float("inf"))))
    for k, v in grps:
        i = next(v)
        try:
            step = next(v) - i  # catches single element v or gives us a step
            nxt = list(next(grps)[1])
            yield xrange(i, nxt.pop(0), step)
            # outliers or another group
            if nxt:
                yield nxt[0] if len(nxt) == 1 else xrange(nxt[0], next(next(grps)[1]), nxt[1] - nxt[0])
        except StopIteration:
            yield i  # no seq
This gives you:
In [2]: l1 = [2, 3, 4, 5, 8, 10, 12, 14, 13, 14, 15, 16, 17, 20, 21]
In [3]: l2 = [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
In [4]: l3 = [13, 14, 15, 16, 17, 18]
In [5]: s1 = [i + 10 for i in xrange(0, 11, 2)]
In [6]: s2 = [30]
In [7]: s3 = [i + 40 for i in xrange(45)]
In [8]: l4 = s1 + s2 + s3
In [9]: l5 = [1, 2, 5, 6, 9, 10]
In [10]: l6 = {1, 2, 3, 5, 6, 9, 10, 13, 19, 21, 22, 23, 24}
In [11]:
In [11]: for l in (l1, l2, l3, l4, l5, l6):
   ....:     print(list(ranges(l)))
   ....:
[xrange(2, 5), xrange(8, 14, 2), xrange(13, 17), 20, 21]
[xrange(2, 8, 2), xrange(12, 17), 20]
[xrange(13, 18)]
[xrange(10, 20, 2), 30, xrange(40, 84)]
[1, 2, 5, 6, 9, 10]
[xrange(1, 3), 5, 6, 9, 10, 13, 19, xrange(21, 24)]
When the step is 1, it is not included in the xrange output.
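The grouping key is the subtle part of this answer: each element is compared with its successor by advancing a second iterator inside the key function. A short Python 3 sketch of just that step, using the l2 input from above, shows what groupby hands back before any ranges are built:

```python
from itertools import groupby

lst = [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
it = iter(lst)
next(it)  # start one element ahead of groupby
# The key for each x is x minus its successor; the last element gets +inf as its key.
groups = [list(g) for _, g in groupby(lst, key=lambda x: x - next(it, -float("inf")))]
print(groups)
# [[2, 4, 6], [8], [12, 13, 14, 15, 16], [17], [20]]
```

The last value of each run lands at the head of the following group, which is why the generator pops the first item of the next group to use as the end of the range. Note that a range such as xrange(2, 8, 2) treats 8 as an exclusive stop, so iterating over it would not produce 8 itself.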
Answer 3 (score: 0)
I ran into this situation once. Here it is.
if (Input.GetKeyDown (KeyCode.F)) {
    freeLook = !freeLook;
}