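Continuing a loop over a generator after breaking out of the loop

Time: 2019-06-23 01:57:36

Tags: python python-3.x

I am trying to use generators in a use case where we have to keep track of the k "largest" elements in a stream of strings. What I want to do is add elements to a list until k elements have been collected, then heapify the list, and then continue through the rest of the stream from there, maintaining the heap one element at a time. I am a bit new to generators, so any help is appreciated.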

import heapq

def my_generator(stream):
    for string in stream:
        yield string

def top_k(k, stream):
    count = 0
    min_heap = []
    for string in stream:
        if count >= k:
            break
        min_heap.append((len(string), string))
        count += 1
        print(min_heap)

    heapq.heapify(min_heap)

    for string in stream:
        heapq.heappushpop(min_heap, (len(string), string))

    return heapq.nsmallest(k, min_heap)

strings = ["This", "whatis", "going", "in"]
stream = my_generator(strings)
output = top_k(2,stream)
print(output)

1 Answer:

Answer 0 (score: 2)

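Your break and the subsequent resumption of the stream cause an element to be "lost" into the void: by the time the break runs, the for loop has already pulled the next item from the generator, and that item is never added to the heap.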
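
To see the loss in isolation, here is a minimal sketch that reuses my_generator and the sample strings from the question. The names taken and remaining are introduced only for illustration.

```python
def my_generator(stream):
    for string in stream:
        yield string

stream = my_generator(["This", "whatis", "going", "in"])

taken = []
for string in stream:
    if len(taken) >= 2:
        # "going" has already been pulled from the generator at this point
        break
    taken.append(string)

# The generator resumes after "going", so only "in" is left
remaining = list(stream)

print(taken)      # ['This', 'whatis']
print(remaining)  # ['in']
```

"going" appears in neither list, which is exactly the element the original top_k never sees.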

Here is your code, changed so that it does not lose any elements:

import heapq

def top_k(k, stream):
    min_heap = []

    # loop k times instead of looping over the stream
    for _ in range(k):
        # take exactly one item; raises StopIteration if the stream has fewer than k items
        string = next(stream)
        min_heap.append((len(string), string))
        print(min_heap)  # debug

    heapq.heapify(min_heap)

    # here we finish all of what's left in stream
    for string in stream:
        heapq.heappushpop(min_heap, (len(string), string))

    return heapq.nsmallest(k, min_heap)
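
As an alternative sketch that is not part of the original answer, itertools.islice can take the first k items without pulling an extra one, and it simply stops early if the stream has fewer than k items. The function name top_k_islice is hypothetical. It returns sorted(min_heap), which should give the same result as heapq.nsmallest(k, min_heap) because the heap never holds more than k items.

```python
import heapq
from itertools import islice

def top_k_islice(k, stream):
    # Accept any iterable (list, generator, ...), not only iterators
    stream = iter(stream)

    # islice consumes at most k items and never pulls a (k+1)-th one
    min_heap = [(len(s), s) for s in islice(stream, k)]
    heapq.heapify(min_heap)

    # Continue with whatever is left in the same iterator
    for s in stream:
        heapq.heappushpop(min_heap, (len(s), s))

    return sorted(min_heap)

strings = ["This", "whatis", "going", "in"]
print(top_k_islice(2, strings))  # [(5, 'going'), (6, 'whatis')]
```

Wrapping the argument in iter() is a design choice made here so that both loops share one iterator; passing a plain list to two separate for loops would restart from the beginning each time.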