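Zip-like function that fails if a particular iterator is not consumed

Time: 2018-11-11 08:48:08

Tags: python python-3.x iterator iterable

I would like a zip-like function that fails if the rightmost iterator is not consumed. It should yield until the failure.

For example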

>>> a = ['a', 'b', 'c']
>>> b = [1, 2, 3, 4]

>>> myzip(a, b)
Traceback (most recent call last):
    ...
ValueError: rightmost iterable was not consumed

>>> list(myzip(b, a))
[(1, 'a'), (2, 'b'), (3, 'c')]

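Is there perhaps a function in the standard library that can help with this?

Important note:

In the real context, the iterators are not over indexable objects, so I can't just check the length or index them.

Edit:

This is what I have come up with so far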

def myzip(*iterables):
    iters = [iter(i) for i in iterables]

    zipped = zip(*iters)

    try:
        next(iters[-1])
        raise ValueError('rightmost iterable was not consumed')
    except StopIteration:
        return zipped

Is this the best solution? It does not preserve the state of the iterator, because I call next on it, which might be a problem.
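A quick check of the attempt above may make the concern concrete. Because zip() is lazy, the call to next() runs before any pairs have been produced, so it takes the first item of the rightmost iterator rather than testing for leftovers. The names myzip_attempt and raises_value_error are used here only for illustration.

```python
def myzip_attempt(*iterables):
    # Same logic as the attempt above, renamed for this illustration.
    iters = [iter(i) for i in iterables]
    zipped = zip(*iters)  # lazy: nothing has been consumed yet
    try:
        next(iters[-1])  # pulls the FIRST item of the rightmost iterator
        raise ValueError('rightmost iterable was not consumed')
    except StopIteration:
        return zipped


def raises_value_error(*iterables):
    """Return True if myzip_attempt raises ValueError for these inputs."""
    try:
        myzip_attempt(*iterables)
    except ValueError:
        return True
    return False


# The question expects this call to succeed, but the check fires first.
print(raises_value_error([1, 2, 3, 4], ['a', 'b', 'c']))  # True
# Only an empty rightmost iterable gets past the check.
print(raises_value_error(['a'], []))  # False
```

So the check has to run after the zipped pairs have been consumed, not before.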

4 answers:

Answer 0: (score: 1)

You can do this in a few different ways.

  1. You could use a normal zip() with an iterator and manually check that it was exhausted.

    def check_consumed(it):
        try:
            next(it)
        except StopIteration:
            pass
        else:
            raise ValueError('rightmost iterable was not consumed')
    
    b_it = iter(b)
    list(zip(a, b_it))
    check_consumed(b_it)
    
  2. You could wrap the normal zip() so that it does the check for you.

    def myzip(a, b):
        b_it = iter(b)
        yield from zip(a, b_it)
        # Or, if you're on a Python version that doesn't have yield from:
        #for item in zip(a, b_it):
        #    yield item
        check_consumed(b_it)
    
    list(myzip(a, b))
    
  3. You could write your own zip() from scratch, using iter() and next().

    (No code for this one, as option 2 is superior to it in every way.)
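As a sketch only, option 2 could be generalised to any number of iterables, as in the question's signature. It relies on zip() pulling from its arguments left to right, so the rightmost iterator is never advanced once an earlier one has run out. The numbers() generator is a hypothetical stand-in for an input that has no length and cannot be indexed.

```python
def check_consumed(it):
    """Raise ValueError if the iterator `it` still has items left."""
    try:
        next(it)
    except StopIteration:
        pass
    else:
        raise ValueError('rightmost iterable was not consumed')


def myzip(*iterables):
    """Like zip(), but fail at the end if the rightmost iterable has items left."""
    if not iterables:
        return
    *others, last = iterables
    last_it = iter(last)
    # zip() pulls from its arguments left to right, so when an earlier
    # iterable runs out first, nothing extra is taken from last_it.
    yield from zip(*others, last_it)
    check_consumed(last_it)


def numbers():
    # A generator has no len() and cannot be indexed.
    yield from (1, 2, 3, 4)


print(list(myzip(numbers(), 'abc')))  # [(1, 'a'), (2, 'b'), (3, 'c')]

# The pairs are yielded first; the error only appears at the end.
pairs = []
try:
    for pair in myzip('abc', numbers()):
        pairs.append(pair)
except ValueError as exc:
    print(pairs, exc)
```

Since myzip is a generator here, the ValueError is raised while iterating, not when myzip() is called.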

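Answer 1: (score: 0)

Another option using zip_longest from itertools. It also returns True or False depending on whether all the lists were consumed. Maybe not the most efficient way, but it can be improved: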

from itertools import zip_longest
a = ['a', 'b', 'c', 'd']
b = [1, 2, 3, 4, 5]
c = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff']

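def myzip(*iterables):
    consumed = True
    zips = []
    # zip_longest pads exhausted iterables with None by default,
    # so a None in a tuple means some iterable ran out early
    # (this assumes the data itself never contains None).
    for zipped in zip_longest(*iterables):
        if None in zipped:
            consumed = False
        else:
            zips.append(zipped)
    return [zips, consumed]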


list(myzip(a, b, c))
#=> [[('a', 1, 'aa'), ('b', 2, 'bb'), ('c', 3, 'cc'), ('d', 4, 'dd')], False]
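One possible improvement, sketched under the assumption that the inputs may legitimately contain None: pass a private sentinel object as fillvalue, so that real data can never be mistaken for padding. The name myzip_flag is illustrative and not part of the answer.

```python
from itertools import zip_longest


def myzip_flag(*iterables):
    """Return [full_tuples, consumed], as above, but safe for None values."""
    missing = object()  # private fill value, cannot collide with real data
    zips = []
    consumed = True
    for zipped in zip_longest(*iterables, fillvalue=missing):
        if any(item is missing for item in zipped):
            consumed = False
        else:
            zips.append(zipped)
    return [zips, consumed]


print(myzip_flag([None, 'b'], [1, 2]))  # [[(None, 1), ('b', 2)], True]
print(myzip_flag(['a'], [1, 2]))        # [[('a', 1)], False]
```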

Answer 2: (score: 0)

I think this does the job by checking whether the last iterator was fully consumed before returning.

# Example copied from https://stackoverflow.com/questions/19151/build-a-basic-python-iterator
class Counter:
    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        return self

    def __next__(self):  # Python 2: def next(self)
        if self.current > self.high:
            raise StopIteration
        else:
            self.current += 1
            return self.current - 1

# modified from https://docs.python.org/3.5/library/functions.html#zip
def myzip(*iterables):
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            if elem is sentinel:
                elem = next(iterators[-1], sentinel)
                if elem is not sentinel:
                    raise ValueError("rightmost iterable was not consumed")
                else:
                    return
            result.append(elem)
        yield tuple(result)


a = Counter(1,7)
b = range(9)

for val in myzip(a,b):
    print(val)

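Answer 3: (score: 0)

There is already a zip_longest in itertools that "extends" the shorter iterables with a default value.

Use it and check whether the default value occurs: if it does, this is a case of "rightmost element not consumed".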

class MyError(ValueError):
    """Unique "default" value that is recognizable and allows None to be in your values."""
    pass

from itertools import zip_longest

isMyError = lambda x:isinstance(x,MyError)

def myzip(a,b):
    """Raises MyError if any non-consumed elements would occur using default zip()."""
    K = zip_longest(a,b, fillvalue=MyError())
    if all(not isMyError(t) for q in K for t in q): 
        return zip(a,b)
    raise MyError("Not all items are consumed") 


a = ['a', 'b', 'c', 'd']
b = [1, 2, 3, 4]
f = myzip(a, b)
print(list(f)) 
try:
    a = ['a', 'b', ]
    b = [1, 2, 3, 4]
    f = myzip(a, b)
    print(list(f)) 
except MyError as e:
    print(e)

Output:

[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
Not all items are consumed

This consumes (in the worst case) the full zipped list once for the check and then returns it as an iterable.
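One thing to watch for with this two-pass idea: it works for lists because they can be iterated twice, but the question states that the real inputs are plain iterators. A small sketch of what happens in that case, using the illustrative name two_pass_zip and a plain sentinel in place of MyError:

```python
from itertools import zip_longest


def two_pass_zip(a, b):
    # Same two-pass idea as above: check with zip_longest, then zip again.
    missing = object()
    checked = zip_longest(a, b, fillvalue=missing)
    if all(item is not missing for pair in checked for item in pair):
        return zip(a, b)
    raise ValueError('Not all items are consumed')


# Lists can be walked twice, so the second pass still sees the data.
print(list(two_pass_zip(['a', 'b'], [1, 2])))  # [('a', 1), ('b', 2)]

# One-shot iterators are exhausted by the check, so nothing is left to zip.
print(list(two_pass_zip(iter(['a', 'b']), iter([1, 2]))))  # []
```

For one-shot inputs, the first pass would have to buffer the tuples it sees, or the check would have to happen lazily at the end, as in answers 0 and 2.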