Writing a yield generator function for a given task

Date: 2018-12-17 16:15:03

Tags: python yield

This is part of the code, which is supposed to run a record search in chunks of 1000 records:

  for subrange, batch in batched(records, size=1000):
      print("Processing records %d-%d" %
        (subrange[0], subrange[-1]))
      process(batch)

I need to write a yield generator function for it. So far I have tried this:

def batched(records, chunk_size=1000):
    """Lazy function (generator) to read records piece by piece.
    Default chunk size: 1k."""
    while True:
        data = records.read(chunk_size)
        if not data:
            break
        yield data

The problem statement is as follows:

For optimal performance, records should be processed in batches.
Create a generator function "batched" that will yield batches of 1000
records at a time.

I am also not quite sure how to test this function, so any ideas?

PS: the batched generator function should come before the given subrange loop.

Answers:

Answer 0 (score: 2):

Your given loop code

for subrange, batch in batched(records, size=1000):
    print("Processing records %d-%d" %
      (subrange[0], subrange[-1]))
    process(batch)

places implicit requirements on batched():

  1. It should return an iterable. That can indeed be achieved with a generator function.
  2. The yielded items should be tuples subrange, batch. The subrange is apparently either a list of the indexes of all elements, just a list or tuple of the start and end index, or a range() object. I assume the latter.

Alas, we know nothing about the given records object. If it has a read() function, your approach can be adapted:

def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
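To try this read()-based version without the real records object, you can use a small stand-in class that exposes a read(size) method. The ListReader name and its behavior here are assumptions for illustration; the actual records API is not shown in the question:

```python
class ListReader:
    """Minimal stand-in for a records object with a read() method
    (hypothetical; the real records API is not given in the question)."""
    def __init__(self, items):
        self._items = items
        self._pos = 0

    def read(self, size):
        # Return up to `size` items, advancing an internal cursor.
        chunk = self._items[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

def batched(records, size=1000):
    """Yield (subrange, batch) tuples from an object with a read() method."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)

reader = ListReader(list(range(2500)))  # 2500 dummy records
for subrange, batch in batched(reader, size=1000):
    print("Processing records %d-%d" % (subrange[0], subrange[-1]))
```

With 2500 dummy records and size=1000, this processes two full batches and one final partial batch of 500 records.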

But if records is just a list that should be subdivided, you can do

def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
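As for testing: feed the list-based version a small list and a small batch size, then check the yielded subranges and batches, including the final partial batch. A minimal sketch (the def is repeated here so the snippet runs on its own):

```python
def batched(records, size=1000):
    """Yield (subrange, batch) tuples by slicing a list."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)

# Use a small list and batch size so the edge cases are easy to see.
records = list(range(10))               # 10 dummy records
batches = list(batched(records, size=4))

assert len(batches) == 3                # batches of 4 + 4 + 2 records
subrange, batch = batches[-1]
assert (subrange[0], subrange[-1]) == (8, 9)   # partial last batch
assert batch == [8, 9]
```

The same assertions scale up directly to size=1000 with a larger dummy list.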
