Batching Pub/Sub requests

Date: 2018-03-02 14:09:38

Tags: google-cloud-pubsub

The Node.js sample code for batching Pub/Sub requests looks like this:

// Imports the Google Cloud client library
const PubSub = require(`@google-cloud/pubsub`);

// Creates a client
const pubsub = new PubSub();

/**
 * TODO(developer): Uncomment the following lines to run the sample.
 */
// const topicName = 'your-topic';
// const data = JSON.stringify({ foo: 'bar' });
// const maxMessages = 10;
// const maxWaitTime = 10000;

// Publishes the message as a string, e.g. "Hello, world!" or JSON.stringify(someObject)
const dataBuffer = Buffer.from(data);

pubsub
  .topic(topicName)
  .publisher({
    batching: {
      maxMessages: maxMessages,
      maxMilliseconds: maxWaitTime,
    },
  })
  .publish(dataBuffer)
  .then(results => {
    const messageId = results[0];
    console.log(`Message ${messageId} published.`);
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

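It is not clear to me how to use this sample to publish multiple messages at once. Can someone explain how to adapt this code so that it publishes several messages at the same time?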

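2 answers:

Answer 0 (score: 8)

If you want to batch messages, you need to keep hold of the publisher and call publish on it multiple times. For example, you could change your code to the following: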

// Imports the Google Cloud client library
const PubSub = require(`@google-cloud/pubsub`);

// Creates a client
const pubsub = new PubSub();


const topicName = 'my-topic';
const maxMessages = 10;
const maxWaitTime = 10000;
const data1 = JSON.stringify({ foo: 'bar1' });
const data2 = JSON.stringify({ foo: 'bar2' });
const data3 = JSON.stringify({ foo: 'bar3' });

const publisher = pubsub.topic(topicName).publisher({
  batching: {
    maxMessages: maxMessages,
    maxMilliseconds: maxWaitTime,
  },
});

function handleResult(p) {
  p.then(results => {
    console.log(`Message ${results} published.`);
  })
  .catch(err => {
    console.error('ERROR:', err);
  });
}

// Publish three messages
handleResult(publisher.publish(Buffer.from(data1)));
handleResult(publisher.publish(Buffer.from(data2)));
handleResult(publisher.publish(Buffer.from(data3)));

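Batching is controlled by the maxMessages and maxMilliseconds properties. The former is the maximum number of messages to include in a batch. The latter is the maximum number of milliseconds to wait before publishing a batch. Together they trade off larger batches (which can be more efficient) against publish latency. If you publish many messages in quick succession, maxMilliseconds has little effect: as soon as 10 messages are ready, the client library sends a publish request to the Cloud Pub/Sub service. If publishing is sporadic or slow, however, a batch may be sent before it contains ten messages.

In the sample code above, we call publish on three messages. That is not enough to fill a batch and trigger a send. So 10,000 milliseconds after the first call to publish, the three messages are sent to Cloud Pub/Sub as one batch.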
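To make that timing concrete, here is a minimal sketch of the batching policy just described. It is a simplified model and not the client library's implementation: the BatchModel class, its advanceTo method, and the explicit nowMs clock argument are all invented for illustration, and it ignores any other limits the real library may apply.

```javascript
// Simplified model of a batching publisher (illustrative only).
class BatchModel {
  constructor({ maxMessages, maxMilliseconds }) {
    this.maxMessages = maxMessages;
    this.maxMilliseconds = maxMilliseconds;
    this.pending = [];        // messages waiting in the current batch
    this.firstAddedAt = null; // time the oldest pending message was queued
    this.sent = [];           // one { atMs, messages } record per flushed batch
  }

  // Move the model's clock to nowMs; flush if the wait time has elapsed.
  advanceTo(nowMs) {
    if (this.pending.length > 0 &&
        nowMs - this.firstAddedAt >= this.maxMilliseconds) {
      this.flush(this.firstAddedAt + this.maxMilliseconds);
    }
  }

  // Queue one message at time nowMs; flush at once if the batch is full.
  publish(message, nowMs) {
    this.advanceTo(nowMs);
    if (this.pending.length === 0) this.firstAddedAt = nowMs;
    this.pending.push(message);
    if (this.pending.length >= this.maxMessages) this.flush(nowMs);
  }

  flush(atMs) {
    this.sent.push({ atMs, messages: this.pending });
    this.pending = [];
    this.firstAddedAt = null;
  }
}

// Three messages queued at t = 0, mirroring the sample above.
const model = new BatchModel({ maxMessages: 10, maxMilliseconds: 10000 });
['bar1', 'bar2', 'bar3'].forEach(m => model.publish(m, 0));

model.advanceTo(9999);                    // still inside the wait window
const sentBeforeDeadline = model.sent.length;
model.advanceTo(10000);                   // wait time elapsed: partial batch goes out

console.log(sentBeforeDeadline);          // 0
console.log(model.sent.length);           // 1
console.log(model.sent[0].atMs);          // 10000
```

Passing the time in explicitly keeps the model deterministic; a real publisher would use a timer instead.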

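Answer 1 (score: 0)

How batching works:

  1. If the number of messages waiting to be published reaches maxMessages, the maxMilliseconds option is ignored and a batch of maxMessages messages is published immediately;

  2. If the number of messages waiting to be published does not reach maxMessages, they are sent as one batch after maxMilliseconds has elapsed.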
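The two rules above can be sketched as a small function. planBatches is a hypothetical helper, not part of @google-cloud/pubsub; it assumes every message is queued at the same moment and only predicts how many batches go out and after what delay.

```javascript
// Hypothetical helper: predicts the batches for `count` messages that are
// all queued at the same moment. Not part of the Pub/Sub client library.
function planBatches(count, maxMessages, maxMilliseconds) {
  if (!Number.isInteger(maxMessages) || maxMessages < 1) {
    throw new RangeError('maxMessages must be a positive integer');
  }
  const batches = [];
  let remaining = count;
  // Rule 1: every full batch is sent immediately.
  while (remaining >= maxMessages) {
    batches.push({ size: maxMessages, delayMs: 0 });
    remaining -= maxMessages;
  }
  // Rule 2: a partial batch waits for maxMilliseconds.
  if (remaining > 0) {
    batches.push({ size: remaining, delayMs: maxMilliseconds });
  }
  return batches;
}

console.log(planBatches(12, 10, 10000));
// [ { size: 10, delayMs: 0 }, { size: 2, delayMs: 10000 } ]
console.log(planBatches(5, 10, 10000));
// [ { size: 5, delayMs: 10000 } ]
```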

Example 1:

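// TypeScript. Assumes PUBSUB_PROJECT_ID is defined elsewhere.
import { PubSub } from '@google-cloud/pubsub';

async function publishMessage(topicName) {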
  console.log(`[${new Date().toISOString()}] publishing messages`);
  const pubsub = new PubSub({ projectId: PUBSUB_PROJECT_ID });
  const topic = pubsub.topic(topicName, {
    batching: {
      maxMessages: 10,
      maxMilliseconds: 10 * 1000,
    },
  });

  const n = 12;
  const dataBufs: Buffer[] = [];
  for (let i = 0; i < n; i++) {
    const data = `message payload ${i}`;
    const dataBuffer = Buffer.from(data);
    dataBufs.push(dataBuffer);
  }

  const results = await Promise.all(
    dataBufs.map((dataBuf, idx) =>
      topic.publish(dataBuf).then((messageId) => {
        console.log(`[${new Date().toISOString()}] Message ${messageId} published. index: ${idx}`);
        return messageId;
      })
    )
  );
  console.log('results:', results.toString());
}

Now we publish 12 messages. Output:

[2020-05-05T09:09:41.847Z] publishing messages
[2020-05-05T09:09:41.955Z] Message 36832 published. index: 0
[2020-05-05T09:09:41.955Z] Message 36833 published. index: 1
[2020-05-05T09:09:41.955Z] Message 36834 published. index: 2
[2020-05-05T09:09:41.955Z] Message 36835 published. index: 3
[2020-05-05T09:09:41.955Z] Message 36836 published. index: 4
[2020-05-05T09:09:41.955Z] Message 36837 published. index: 5
[2020-05-05T09:09:41.955Z] Message 36838 published. index: 6
[2020-05-05T09:09:41.955Z] Message 36839 published. index: 7
[2020-05-05T09:09:41.955Z] Message 36840 published. index: 8
[2020-05-05T09:09:41.955Z] Message 36841 published. index: 9
[2020-05-05T09:09:51.939Z] Message 36842 published. index: 10
[2020-05-05T09:09:51.939Z] Message 36843 published. index: 11
results: 36832,36833,36834,36835,36836,36837,36838,36839,36840,36841,36842,36843

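Note the timestamps. The first 10 messages are published immediately because their count reaches maxMessages. The remaining 2 messages do not reach maxMessages, so pubsub waits 10 seconds (maxMilliseconds) and then sends them.

Example 2: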

async function publishMessage(topicName) {
  console.log(`[${new Date().toISOString()}] publishing messages`);
  const pubsub = new PubSub({ projectId: PUBSUB_PROJECT_ID });
  const topic = pubsub.topic(topicName, {
    batching: {
      maxMessages: 10,
      maxMilliseconds: 10 * 1000,
    },
  });

  const n = 5;
  const dataBufs: Buffer[] = [];
  for (let i = 0; i < n; i++) {
    const data = `message payload ${i}`;
    const dataBuffer = Buffer.from(data);
    dataBufs.push(dataBuffer);
  }

  const results = await Promise.all(
    dataBufs.map((dataBuf, idx) =>
      topic.publish(dataBuf).then((messageId) => {
        console.log(`[${new Date().toISOString()}] Message ${messageId} published. index: ${idx}`);
        return messageId;
      })
    )
  );
  console.log('results:', results.toString());
}

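Now we send 5 messages, which is fewer than maxMessages. pubsub therefore waits 10 seconds (maxMilliseconds) and then sends the 5 messages as one batch. This is the same situation as the remaining 2 messages in the first example. Output: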

[2020-05-05T09:10:16.857Z] publishing messages
[2020-05-05T09:10:26.977Z] Message 36844 published. index: 0
[2020-05-05T09:10:26.977Z] Message 36845 published. index: 1
[2020-05-05T09:10:26.977Z] Message 36846 published. index: 2
[2020-05-05T09:10:26.977Z] Message 36847 published. index: 3
[2020-05-05T09:10:26.977Z] Message 36848 published. index: 4
results: 36844,36845,36846,36847,36848
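
As a quick check on the two runs above, the waits can be computed directly from the logged timestamps. The variable names here are only illustrative; the timestamps are copied from the output.

```javascript
// Example 1: start, first batch (10 messages), second batch (2 messages).
const run1Start = Date.parse('2020-05-05T09:09:41.847Z');
const run1FirstBatch = Date.parse('2020-05-05T09:09:41.955Z');
const run1SecondBatch = Date.parse('2020-05-05T09:09:51.939Z');

// Example 2: start and the single batch of 5 messages.
const run2Start = Date.parse('2020-05-05T09:10:16.857Z');
const run2Batch = Date.parse('2020-05-05T09:10:26.977Z');

const fullBatchDelay = run1FirstBatch - run1Start;     // full batch: sent at once
const leftoverGap = run1SecondBatch - run1FirstBatch;  // leftover 2 messages
const partialBatchDelay = run2Batch - run2Start;       // partial batch of 5

console.log(fullBatchDelay);    // 108
console.log(leftoverGap);       // 9984
console.log(partialBatchDelay); // 10120
```

The full batch goes out in about a tenth of a second, while both partial batches wait roughly the 10,000 ms set by maxMilliseconds.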