How can I consume an iterator in batches (equally sized chunks)?

Time: 2019-01-25 16:31:21

Tags: javascript ecmascript-6 generator

I often use batch() in Python. Now that JavaScript has iterators and generator functions (since ES6), is there an equivalent?

2 answers:

Answer 0 (score: 3)

I had to write one for myself, and I'm sharing it here so that I and others can find it easily:

// Successively yields iterators of at most `size` items each.
// Each one has to be fully consumed before the next one is requested.
function* batches(iterable, size) {
  const it = iterable[Symbol.iterator]();
  while (true) {
    // this is for the case when batch ends at the end of iterable
    // (we don't want to yield empty batch)
    let {value, done} = it.next();
    if (done) return value;

    yield function*() {
      yield value;
      for (let curr = 1; curr < size; curr++) {
        ({value, done} = it.next());
        if (done) return;

        yield value;
      }
    }();
    if (done) return value;
  }
}

It yields generators, not Arrays. You have to fully consume each batch before calling next() on the outer generator again.
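To make the "fully consume each batch" requirement concrete, here is a minimal usage sketch. The `batches` generator is repeated from the answer so the snippet runs on its own; `toArrays` is a hypothetical helper name introduced only for this illustration. It drains every batch with spread syntax before the outer loop asks for the next one.

```javascript
// Same generator as in the answer above, repeated so this snippet is self-contained.
function* batches(iterable, size) {
  const it = iterable[Symbol.iterator]();
  while (true) {
    let { value, done } = it.next();
    if (done) return value;
    yield (function* () {
      yield value;
      for (let curr = 1; curr < size; curr++) {
        ({ value, done } = it.next());
        if (done) return;
        yield value;
      }
    })();
    if (done) return value;
  }
}

// Hypothetical helper: turns every batch into an array before moving on.
function toArrays(iterable, size) {
  const result = [];
  for (const batch of batches(iterable, size)) {
    // Spreading runs the inner generator to completion,
    // which satisfies the "fully consumed" requirement.
    result.push([...batch]);
  }
  return result;
}

console.log(toArrays([1, 2, 3, 4, 5, 6, 7], 3)); // [[1, 2, 3], [4, 5, 6], [7]]
```

If a batch is abandoned halfway, its unread items end up at the start of the next batch, so draining each batch (as the spread does here) is the safe pattern.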

Answer 1 (score: 1)

I came here to see what others had suggested. Here is the version I originally wrote in TypeScript before finding this post.

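// Exported so it can be imported from './batch' in the Mongo example further down.
export async function* batch<T>(iterable: AsyncIterableIterator<T>, batchSize: number) {
  let items: T[] = [];
  for await (const item of iterable) {
    items.push(item);
    if (items.length >= batchSize) {
      yield items;
      items = [];
    }
  }
  // Emit the final, possibly shorter, batch.
  if (items.length !== 0) {
    yield items;
  }
}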

This lets you consume an iterable in batches, like so.

async function doYourThing<T>(iterable: AsyncIterableIterator<T>) {
  const itemsPerBatch = 5
  const batchedIterable = batch<T>(iterable, itemsPerBatch)
  for await (const items of batchedIterable) {
    await someOperation(items)
  }
}
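To try the batching loop without a database, here is a minimal plain-JavaScript sketch. `batch` is a transcription of the TypeScript version above with the types removed; `numbers` and `collectBatches` are hypothetical names introduced only for this illustration, with `numbers` standing in for a real async source such as a cursor.

```javascript
// Plain-JavaScript transcription of the TypeScript `batch` above (types removed).
async function* batch(iterable, batchSize) {
  let items = [];
  for await (const item of iterable) {
    items.push(item);
    if (items.length >= batchSize) {
      yield items;
      items = [];
    }
  }
  // Emit the final, possibly shorter, batch.
  if (items.length !== 0) yield items;
}

// Hypothetical async source standing in for a database cursor.
async function* numbers(count) {
  for (let i = 1; i <= count; i++) {
    yield i;
  }
}

// Hypothetical helper: gathers all batches so they can be inspected.
async function collectBatches(source, batchSize) {
  const collected = [];
  for await (const items of batch(source, batchSize)) {
    collected.push(items);
  }
  return collected;
}

collectBatches(numbers(12), 5).then((collected) => {
  console.log(collected.map((b) => b.length)); // [5, 5, 2]
});
```

Because each batch is a plain array, it can be kept around or skipped freely, unlike the generator-of-generators approach in the first answer, at the cost of holding up to `batchSize` items in memory at once.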

In my case, this made it easier to use bulk operations in Mongo, as shown below.

import { MongoClient, ObjectID } from 'mongodb';
import { batch } from './batch';

const config = {
  mongoUri: 'mongodb://localhost:27017/test?replicaSet=rs0',
};

interface Doc {
  readonly _id: ObjectID;
  readonly test: number;
}

async function main() {
  const client = await MongoClient.connect(config.mongoUri);
  const db = client.db('test');
  const coll = db.collection<Doc>('test');
  await coll.deleteMany({});
  console.log('Deleted test docs');

  const testDocs = new Array(4).fill(null).map(() => ({ test: 1 }));
  await coll.insertMany(testDocs);
  console.log('Inserted test docs');

  const cursor = coll.find().batchSize(5);
  for await (const docs of batch<Doc>(cursor as any, 5)) {
    const bulkOp = coll.initializeUnorderedBulkOp();
    docs.forEach((doc) => {
      // Use an update operator; newer drivers reject updateOne() documents without one.
      bulkOp.find({ _id: doc._id }).updateOne({ $set: { test: 2 } });
    });
    console.log('Updating', docs.length, 'test docs');
    await bulkOp.execute();
  }
  console.log('Updated test docs');
}

main()
  .catch(console.error)
  .then(() => process.exit());