Writing a yield generator function for a given task

Date: 2018-12-17 16:15:03

Tags: python yield

This is part of the code, which is supposed to run a record search in chunks of 1000 records:

  for subrange, batch in batched(records, size=1000):
      print("Processing records %d-%d" %
        (subrange[0], subrange[-1]))
      process(batch)

I need to write a yield generator function for it. So far I have tried this:

def batched(records, chunk_size=1000):
    """Lazy function (generator) to read records piece by piece.
    Default chunk size: 1k."""
    while True:
        data = records.read(chunk_size)
        if not data:
            break
        yield data

The problem statement is as follows:

For optimal performance, records should be processed in batches.
Create a generator function "batched" that will yield batches of 1000
records at a time.

I am also not quite sure how to test this function, so any ideas?

PS: the batched generator function should come before the given subrange loop.

Answers:

Answer 0 (score: 2):

Your given loop code

for subrange, batch in batched(records, size=1000):
    print("Processing records %d-%d" %
      (subrange[0], subrange[-1]))
    process(batch)

places implicit requirements on batched():

  1. It should return an iterable. That can indeed be achieved with a generator function.
  2. The yielded items should be tuples subrange, batch. The subrange is apparently either a list of the indexes of all elements, just a list or tuple of the start and end index, or a range() object. I assume the latter.

Alas, we know nothing about the given records object. If it has a read() function, your approach can be adapted:

def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
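To try this read()-based version without the real records object, you can use a small stand-in class that exposes a read(size) method. The ListReader name and its behavior here are assumptions for illustration; the actual records API is not shown in the question:

```python
class ListReader:
    """Minimal stand-in for a records object with a read() method
    (hypothetical; the real records API is not given in the question)."""
    def __init__(self, items):
        self._items = items
        self._pos = 0

    def read(self, size):
        # Return up to `size` items, advancing an internal cursor.
        chunk = self._items[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

def batched(records, size=1000):
    """Yield (subrange, batch) tuples from an object with a read() method."""
    index = 0
    while True:
        data = records.read(size)
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)

reader = ListReader(list(range(2500)))  # 2500 dummy records
for subrange, batch in batched(reader, size=1000):
    print("Processing records %d-%d" % (subrange[0], subrange[-1]))
```

With 2500 dummy records and size=1000, this processes two full batches and one final partial batch of 500 records.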

But if records is just a list that should be subdivided, you can do

def batched(records, size=1000):
    """Generator function to read records piece by piece.
    Default chunk size: 1k."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)
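As for testing: feed the list-based version a small list and a small batch size, then check the yielded subranges and batches, including the final partial batch. A minimal sketch (the def is repeated here so the snippet runs on its own):

```python
def batched(records, size=1000):
    """Yield (subrange, batch) tuples by slicing a list."""
    index = 0
    while True:
        data = records[index:index + size]
        if not data:
            break
        yield range(index, index + len(data)), data
        index += len(data)

# Use a small list and batch size so the edge cases are easy to see.
records = list(range(10))               # 10 dummy records
batches = list(batched(records, size=4))

assert len(batches) == 3                # batches of 4 + 4 + 2 records
subrange, batch = batches[-1]
assert (subrange[0], subrange[-1]) == (8, 9)   # partial last batch
assert batch == [8, 9]
```

The same assertions scale up directly to size=1000 with a larger dummy list.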
