Question

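I'm trying to figure out how to efficiently implement a generator over an iterable that yields all look-ahead or look-behind pairs within a defined window.

For example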

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pairs_lookahead(seq, behind=0, forward=3, empty=None)

should produce something like

[((1, 2), (1, 3), (1, 4)), ((2, 3), (2, 4), (2, 5)), ...]

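When an element does not exist during a look-ahead or look-behind, it should be filled in with a defined empty value.

Here is what I have so far for the look-ahead generator: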

def lookforward(seq, behind, forward, empty=None):
    itr = iter(seq)

    lst = [empty]*behind + [next(itr)]

    # Prime the needed lookforward values:
    for x in range(forward):
        try:
            lst.append(next(itr))
        except StopIteration:
            lst.append(empty)
            forward -= 1

    # Yield the current tuple, then shift the list and generate a new item:
    for item in itr:
        yield tuple(lst)
        lst = lst[1:] + [item]

    # Yield the last full tuple, then continue with None for each lookforward
    # position:
    for x in range(forward + 1):
        yield tuple(lst)
        lst = lst[1:] + [empty]

print(list(lookforward(range(10), 0, 3)))

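Running the above implementation gives:

> [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 6), (4, 5, 6, 7), (5, 6, 7, 8), (6, 7, 8, 9), (7, 8, 9, None), (8, 9, None, None), (9, None, None, None)]

I'm not sure how to proceed from here. The implementation above generates look-ahead and look-behind sequences, but I'm not sure how to modify it to generate sequences of pairs. I'm also concerned that my implementation may not be efficient. I'm fairly inexperienced with iterator implementations in Python. Any help would be greatly appreciated.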

Answer 1

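You can try something like the following code I wrote:

Explanation:

If behind / forward is 0, the variables behind_counter / forward_counter are set to 1 so that the following loops run at least once.

The outer loop iterates over the range of seq; the two inner loops iterate over the ranges of behind_counter (counting down) and forward_counter (counting up), respectively. In the innermost loop we set the respective look-behind / look-ahead indices and then use three if statements to check whether an index is out of range, so that the corresponding value can be set to the out-of-range marker ('null'). If no index is out of range, the fourth if statement is chosen. Inside each if statement there is an if-elif-elif-else chain that changes the appended tuple depending on the look-behind / look-ahead values stored in behind and forward. If both are 0, only seq[i] is appended; if behind is 0, only a tuple consisting of seq[i] and the current look-ahead value is appended; and so on.

When the work is done, we print the value of res to visualize the result.

Source code: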

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
behind = 0
forward = 3

res = []

behind_counter = behind
forward_counter = forward

if behind == 0:
    behind_counter = 1
if forward == 0:
    forward_counter = 1

for i in range(len(seq)):
    for j in range(behind_counter,0,-1):
        for k in range(forward_counter):
            index_behind = i - j
            index_forward = i + k + 1
            if index_behind < 0 and index_forward >= len(seq):
                index_behind = 'null'
                index_forward = 'null'
                if behind == 0 and forward == 0:
                    res.append((seq[i],))
                elif behind == 0:
                    res.append(tuple((seq[i],index_forward)))
                elif forward == 0:
                    res.append(tuple((index_behind,seq[i])))
                else:
                    res.append(tuple((index_behind,seq[i],index_forward)))
                continue
            if index_behind < 0:
                index_behind = 'null'
                if behind == 0 and forward == 0:
                    res.append((seq[i],))
                elif behind == 0:
                    res.append(tuple((seq[i],seq[index_forward])))
                elif forward == 0:
                    res.append(tuple((index_behind,seq[i])))
                else:
                    res.append(tuple((index_behind,seq[i],seq[index_forward])))
                continue
            if index_forward >= len(seq):
                index_forward = 'null'
                if behind == 0 and forward == 0:
                    res.append((seq[i],))
                elif behind == 0:
                    res.append(tuple((seq[i],index_forward)))
                elif forward == 0:
                    res.append(tuple((seq[index_behind],seq[i])))
                else:
                    res.append(tuple((seq[index_behind],seq[i],index_forward)))
                continue
            if index_forward < len(seq) and index_behind >= 0:
                if behind == 0 and forward == 0:
                    res.append((seq[i],))
                elif behind == 0:
                    res.append(tuple((seq[i],seq[index_forward])))
                elif forward == 0:
                    res.append(tuple((seq[index_behind],seq[i])))
                else:
                    res.append(tuple((seq[index_behind],seq[i],seq[index_forward])))
print(res)

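Output:

[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (6, 9), (7, 8), (7, 9), (7, 10), (8, 9), (8, 10), (8, 'null'), (9, 10), (9, 'null'), (9, 'null'), (10, 'null'), (10, 'null'), (10, 'null')]

First major update:

According to your new comment, you want a different output when both look-behind and look-ahead are specified, so I changed the program. Hopefully it now does what you need:

Additional explanation of the updated program:

In each iteration of the outer loop, the look-behind pairs are added first in one loop, and then the look-ahead pairs are added in a second loop. As before, we check the bounds and set the look-behind / look-ahead values accordingly.

Updated source code: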

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
behind = 2
forward = 3

res = []

behind_counter = behind
forward_counter = forward

if behind == 0:
    behind_counter = 1
if forward == 0:
    forward_counter = 1

for i in range(len(seq)):
    for j in range(behind_counter,0,-1):
        index_behind = i - j
        if behind == 0:
            #res.append(tuple((seq[i])))
            continue
        else:
            if index_behind < 0:
                index_behind = 'null'
                res.append(tuple((seq[i],index_behind)))
                continue
            else:
                res.append(tuple((seq[i], seq[index_behind])))
    for k in range(forward_counter):
        index_forward = i + k + 1
        if forward == 0:
            #res.append(tuple((seq[i])))
            continue
        else:
            if index_forward >= len(seq):
                index_forward = 'null'
                res.append(tuple((seq[i],index_forward)))
                continue
            else:
                res.append(tuple((seq[i],seq[index_forward])))
print(res)

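New output [when both look-behind and look-ahead are specified]:

[(1, 'null'), (1, 'null'), (1, 2), (1, 3), (1, 4), (2, 'null'), (2, 1), (2, 3), (2, 4), (2, 5), (3, 1), (3, 2), (3, 4), (3, 5), (3, 6), (4, 2), (4, 3), (4, 5), (4, 6), (4, 7), (5, 3), (5, 4), (5, 6), (5, 7), (5, 8), (6, 4), (6, 5), (6, 7), (6, 8), (6, 9), (7, 5), (7, 6), (7, 8), (7, 9), (7, 10), (8, 6), (8, 7), (8, 9), (8, 10), (8, 'null'), (9, 7), (9, 8), (9, 10), (9, 'null'), (9, 'null'), (10, 8), (10, 9), (10, 'null'), (10, 'null'), (10, 'null')]

Second major update:

If you want a list containing tuples of tuples, you can do the following [I slightly modified the code from my first major update]:

Additional explanation:

At the beginning of each outer-loop iteration we append an empty list to res. To this list we first append the look-behind pairs and then the look-ahead pairs. At the end of each outer-loop iteration we convert this newly created list into a tuple.

Source code (builds a list containing tuples of tuples):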

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
behind = 2
forward = 3

res = []

behind_counter = behind
forward_counter = forward

if behind == 0:
    behind_counter = 1
if forward == 0:
    forward_counter = 1

for i in range(len(seq)):
    res.append(list())
    for j in range(behind_counter,0,-1):
        index_behind = i - j
        if behind == 0:
            #res.append(tuple((seq[i])))
            continue
        else:
            if index_behind < 0:
                index_behind = 'null'
                res[i].append((seq[i],index_behind))
                continue
            else:
                res[i].append((seq[i], seq[index_behind]))
    for k in range(forward_counter):
        index_forward = i + k + 1
        if forward == 0:
            #res.append(tuple((seq[i])))
            continue
        else:
            if index_forward >= len(seq):
                index_forward = 'null'
                res[i].append((seq[i],index_forward))
                continue
            else:
                res[i].append((seq[i],seq[index_forward]))
    res[i] = tuple(res[i])
print(res)
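
The same bounds-checking idea can be written more compactly as a generator. The following is only a sketch under stated assumptions: the name pairs_by_index is made up for illustration, seq is assumed to support len() and indexing, and the pairs are grouped per element as in the second update.

```python
def pairs_by_index(seq, behind=0, forward=0, empty='null'):
    """Yield, for each element, a tuple of (element, neighbour) pairs.

    Hypothetical helper, not part of the original answer. Neighbours
    that fall outside the sequence are replaced by `empty`.
    """
    n = len(seq)
    # Offsets -behind..-1 (farthest look-behind first), then 1..forward.
    offsets = [d for d in range(-behind, forward + 1) if d != 0]
    for i in range(n):
        yield tuple(
            (seq[i], seq[i + d] if 0 <= i + d < n else empty)
            for d in offsets
        )


result = list(pairs_by_index([1, 2, 3], behind=1, forward=2))
print(result)
```

Because the offsets are computed once, the four near-identical branches collapse into a single conditional expression, and no special case is needed when behind or forward is 0.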

Answer 2

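Explanation

I think the best solution is to use as many of the available tools as possible. In this case in particular, a very interesting one is zip (and its alternative zip_longest):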

from itertools import zip_longest

seq = [1, 2, 3, 4]
print(list(zip(seq, seq[1:])))
print(list(zip_longest(seq, seq[1:])))

which produces:

[(1, 2), (2, 3), (3, 4)]
[(1, 2), (2, 3), (3, 4), (4, None)]
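
As a side note, zip_longest pads with None by default, and its fillvalue keyword argument selects a different padding value. That is what makes a custom empty marker possible later on. A small sketch (the marker string "EMPTY" is arbitrary):

```python
from itertools import zip_longest

seq = [1, 2, 3, 4]
# Pad the shorter input with a custom marker instead of None.
padded = list(zip_longest(seq, seq[1:], fillvalue="EMPTY"))
print(padded)
```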

Note also that zip can be used to "unzip":

print(list(zip(*[(1, 2), (2, 3), (3, 4)])))

Output:

[(1, 2, 3), (2, 3, 4)]

The first step is to understand this piece of code, which builds the case forward=2:

from itertools import zip_longest

seq = [1, 2, 3, 4]
one_step_ahead = zip_longest(seq, seq[1:])
two_steps_ahead = zip_longest(seq, seq[2:])
# print(list(one_step_ahead))  # => [(1, 2), (2, 3), (3, 4), (4, None)]
# print(list(two_steps_ahead)) # => [(1, 3), (2, 4), (3, None), (4, None)]
merged = zip(one_step_ahead, two_steps_ahead)
print(list(merged))

This prints:

[((1, 2), (1, 3)), ((2, 3), (2, 4)), ((3, 4), (3, None)), ((4, None), (4, None))]

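This is very close to a general solution. The only thing assumed here is that we have exactly two zip objects to merge; in the real case we will have an unknown number of them, so we need to turn merged = zip(one_step_ahead, two_steps_ahead) into something that works for a list of unknown size. To do that, we simply add all the "x_steps_ahead" objects to a list, call it pairs, and then merge them all with the unpacking operation *pairs. In the end it looks like this: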

from itertools import zip_longest

seq = [1, 2, 3, 4]

pairs = []
for x in range(1, 3):  # offsets 1 and 2; offset 0 would pair each item with itself
  x_step_ahead = zip_longest(seq, seq[x:])
  pairs.append(x_step_ahead)

merged = zip(*pairs)
print(list(merged))

which produces the same result as before:

[((1, 2), (1, 3)), ((2, 3), (2, 4)), ((3, 4), (3, None)), ((4, None), (4, None))]

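That is basically the whole idea behind the code I'm proposing. The look-behind case is a bit more unusual, but I'll leave understanding how it works as an exercise. A subtle difference in the final code is that I avoid instantiating lists as much as possible. Iterators and generators are preferred, which makes the code a bit harder to read but more efficient in terms of memory usage.

Basically, something like pairs becomes: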

def pairs_generator():
  for x in range(1, 3):  # offsets 1 and 2, as in the list version
    yield zip_longest(seq, seq[x:])

pairs = pairs_generator()

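This does exactly the same thing as the previous code; it just avoids storing a list of size x in memory to remember all the zips we created.

For the same reason, in the code below I also use itertools.islice instead of classic slicing, because it is a lighter version (unlike a slice, it does not instantiate a copy of the input list).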
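
To make the difference concrete, here is a minimal sketch comparing the two. The variable names are illustrative only. Note that islice still has to step over the skipped elements when it is consumed, and that the resulting iterator can only be consumed once.

```python
from itertools import islice

seq = [1, 2, 3, 4, 5]

copied = seq[2:]             # builds a new list holding 3 elements
lazy = islice(seq, 2, None)  # an iterator over seq; nothing is copied

lazy_values = list(lazy)     # consuming it yields the same values
leftover = list(lazy)        # a second pass finds it exhausted
print(copied, lazy_values, leftover)
```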

Solution implementation

from itertools import zip_longest, islice

def fill_with(iterator, value, times):
  """Add `value` `times` times in front of the iterator."""
  for _ in range(times):
    yield value
  yield from iterator

def pairs(seq, distance, reverse=False, empty=None):
  """Build lookup pairs from a list, for example: 
    list(pairs([1,2,3], 1)) => [(1, 2), (2, 3), (3, None)]
  and reverse make backward lookups: 
    list(pairs([1,2,3], 1, reverse=True)) => [(1, None), (2, 1), (3, 2)]
  """
  if reverse:
    return zip(seq, fill_with(seq, empty, distance))
  else:
    return zip_longest(seq, islice(seq, distance, None), fillvalue=empty)

def look_backward(seq, distance, empty=None):
  """Build look backward tuples, for example calling 
  list(look_backward([1,2,3], 2)) will produce: 
    [((1, None), (1, None)), ((2, None), (2, 1)), ((3, 1), (3, 2))]
  """
  return zip(*(pairs(seq, i, empty=empty, reverse=True) for i in range(distance,0, -1)))

def look_forward(seq, distance, empty=None):
  """Build look forward tuples, for example calling 
  list(look_forward([1,2,3], 2)) will produce: 
    [((1, 2), (1, 3)), ((2, 3), (2, None)), ((3, None), (3, None))]
  """
  return zip(*(pairs(seq, i+1, empty=empty) for i in range(distance)))

def pairs_lookahead(seq, behind=0, forward=3, empty=None):
  """Produce the results expected by https://stackoverflow.com/q/54847423/1720199"""
  backward_result = look_backward(seq, behind, empty=empty)
  forward_result = look_forward(seq, forward, empty=empty)
  if behind < 1 and forward > 0:
    return forward_result
  if behind > 0 and forward < 1:
    return backward_result
  return [a+b for a, b in zip(backward_result, forward_result)]

You can call it as suggested:

seq = [1, 2, 3, 4]
result = pairs_lookahead(seq, behind=2, forward=1, empty="Z")
print(list(result))

result = pairs_lookahead(seq, behind=2, forward=0, empty="Y")
print(list(result))

result = pairs_lookahead(seq, behind=0, forward=1, empty="X")
print(list(result))

This outputs:

[((1, 'Z'), (1, 'Z'), (1, 2)), ((2, 'Z'), (2, 1), (2, 3)), ((3, 1), (3, 2), (3, 4)), ((4, 2), (4, 3), (4, 'Z'))]
[((1, 'Y'), (1, 'Y')), ((2, 'Y'), (2, 1)), ((3, 1), (3, 2)), ((4, 2), (4, 3))]
[((1, 2),), ((2, 3),), ((3, 4),), ((4, 'X'),)]

Answer 3

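A useful intermediate step toward solving the problem is to produce sequences of adjacent values. So if the input is [1, 2, 3, 4, 5, ...], you want to iterate and get (1, 2, 3, 4), then (2, 3, 4, 5), and so on.

This can be done with itertools.tee.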

import itertools

def n_wise(iterable, n):
    iterators = itertools.tee(iterable, n)
    for i, iterator in enumerate(iterators):
        next(itertools.islice(iterator, i, i), None)  # discard i values from the iterator
    return zip(*iterators)
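
As a quick check of what n_wise produces, here is a small usage sketch. The function body is repeated only so that the snippet runs on its own, and the input is deliberately a one-shot iterator to show that tee makes this work without indexing.

```python
import itertools

def n_wise(iterable, n):
    # Same definition as above, repeated so the snippet is self-contained.
    iterators = itertools.tee(iterable, n)
    for i, iterator in enumerate(iterators):
        # Advance the i-th copy by i positions.
        next(itertools.islice(iterator, i, i), None)
    return zip(*iterators)

# A one-shot iterator over 1..6, viewed through windows of width 4.
windows = list(n_wise(iter(range(1, 7)), 4))
print(windows)
```

The result stops as soon as the most advanced copy runs out, so only full windows are produced.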

Now we can easily do the look-ahead and look-behind (I'm ignoring the empty value for the moment):

def lookaround(iterable, behind, ahead):
    for values in n_wise(iterable, 1 + behind + ahead):
        behind_values = values[:behind]
        current_value = values[behind]
        ahead_values = values[behind+1:]
        for b in behind_values:
            yield current_value, b
        for a in ahead_values:
            yield current_value, a

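The simplest way to adapt this to support empty values is just to pad the iterable. You need behind extra empty values at the start and ahead extra empty values at the end.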

def lookaround_with_empties(iterable, behind, ahead, empty=None):
    padded_iterable = itertools.chain([empty]*behind, iterable, [empty]*ahead)
    return lookaround(padded_iterable, behind, ahead)

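Now, this is a bit odd when it looks back or ahead over several empty values in a row (since it repeats the same output for each missing value), but I'm not sure what you expect in those cases. If needed, there is probably a simple way to filter the output to avoid the repeats.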
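
One possible reading of "filtering the output" is to drop every pair whose neighbour is missing, rather than only the repeated ones. The sketch below does that with a private sentinel object so that legitimate None values in the input are not affected. The names _MISSING and lookaround_without_empties are hypothetical, and the two helper functions are repeated from above so the snippet runs on its own.

```python
import itertools

def n_wise(iterable, n):
    iterators = itertools.tee(iterable, n)
    for i, iterator in enumerate(iterators):
        next(itertools.islice(iterator, i, i), None)
    return zip(*iterators)

def lookaround(iterable, behind, ahead):
    for values in n_wise(iterable, 1 + behind + ahead):
        current_value = values[behind]
        for b in values[:behind]:
            yield current_value, b
        for a in values[behind + 1:]:
            yield current_value, a

_MISSING = object()  # hypothetical sentinel marking padded positions

def lookaround_without_empties(iterable, behind, ahead):
    # Pad as before, then discard pairs that point at a padded position.
    padded = itertools.chain([_MISSING] * behind, iterable, [_MISSING] * ahead)
    return (pair for pair in lookaround(padded, behind, ahead)
            if pair[1] is not _MISSING)

pairs = list(lookaround_without_empties([1, 2, 3], 2, 1))
print(pairs)
```

Whether dropping these pairs is acceptable depends on what the caller expects at the edges; if exactly one empty pair per side is wanted instead, the filter condition would need to change.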

Answer 4

As a first step, write a function that, given a pivot index, returns the list of look-behind and look-ahead pairs centered on that pivot.

Using list comprehensions:

def around_pivot(seq, pivot, behind=2, forward=3):
    return [[seq[pivot], seq[pivot+i] if pivot+i < len(seq) and pivot+i >= 0 else None]
      for i in range(-behind, forward+1) if i != 0]

If you now want all the pairs, just apply a list comprehension again, varying the pivot:

all_pairs = [around_pivot(seq, i) for i in range(len(seq))]

Or, if you prefer a one-line solution (but keep the readability and maintainability of the code in mind):

def all_pairs(seq, behind=2, forward=3):
    return [[[seq[p], seq[p+i] if p+i < len(seq) and p+i >= 0 else None]
      for i in range(-behind, forward+1) if i != 0] for p in range(len(seq))]

If you want to optimize memory usage, you can use generator expressions instead:

def all_pairs(seq, behind=2, forward=3):
    return (((seq[p], seq[p+i] if p+i < len(seq) and p+i >= 0 else None)
      for i in range(-behind, forward+1) if i != 0) for p in range(len(seq)))
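
A usage note on the generator-expression version, offered as a sketch: each inner generator reads the loop variable p lazily, so it should be consumed before the outer generator advances. The function is repeated so the snippet runs on its own; the variable names materialized and stale are illustrative.

```python
def all_pairs(seq, behind=2, forward=3):
    # Same generator-expression version as above.
    return (((seq[p], seq[p+i] if p+i < len(seq) and p+i >= 0 else None)
      for i in range(-behind, forward+1) if i != 0) for p in range(len(seq)))

seq = [1, 2, 3]

# Safe: each inner generator is turned into a tuple right away.
materialized = [tuple(inner) for inner in all_pairs(seq, behind=1, forward=1)]
print(materialized)

# Pitfall: collecting the inner generators first lets p reach its final
# value before any of them runs, so they all describe the last element.
stale = [tuple(inner) for inner in list(all_pairs(seq, behind=1, forward=1))]
print(stale)
```

If the inner results need to outlive the iteration, building tuples (as in the first loop) or using the list-comprehension version avoids the problem.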

Implementing a generator for a sequence of look-ahead pairs
