Question

I am publishing Pubsub messages from the AppEngine Flexible environment using the Java client library, as shown below:

Publisher publisher = Publisher
                .newBuilder(ProjectTopicName.of(Utils.getApplicationId(), "test-topic"))
                .setBatchingSettings(
                        BatchingSettings.newBuilder()
                                .setIsEnabled(false)
                                .build())
                .build();

publisher.publish(PubsubMessage.newBuilder()
                .setData(ByteString.copyFromUtf8(message))
                .putAttributes("timestamp", String.valueOf(System.currentTimeMillis()))
                .build());

I subscribe to the topic in Dataflow and log how long each message takes to get from AppEngine Flexible to Dataflow:

pipeline
        .apply(PubsubIO.readMessagesWithAttributes().fromSubscription(Utils.buildPubsubSubscription(Constants.PROJECT_NAME, "test-topic")))
        .apply(ParDo.of(new DoFn<PubsubMessage, PubsubMessage>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                // Latency = receive time minus the publish time stored in the "timestamp" attribute.
                long latencyMillis = System.currentTimeMillis() - Long.parseLong(c.element().getAttribute("timestamp"));
                System.out.println("Time: " + latencyMillis);
            }
        }));
pipeline.run();

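When I publish at a rate of a few messages per second, the logs show that messages take between 100 ms and 1.5 s to reach Dataflow. At roughly 100 messages per second, however, the time is usually between 100 ms and 200 ms, which seems adequate. Can someone explain this behavior? Turning off publisher batching seems to have no effect.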

Answer 1

Pub/Sub is designed for high-throughput messaging with both subscription types.

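Pull subscriptions work best when message volume is high; this is the subscription type to use when you prioritize message-processing throughput. Note in particular that synchronous pull does not process messages as soon as they are published; instead, it pulls and processes a fixed number of messages per request (more messages, more pulls). A better option is asynchronous pull, which uses a long-running message listener and acknowledges messages one at a time [1].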

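Push subscriptions, on the other hand, use a slow-start algorithm: the number of messages sent doubles after each successful delivery until it reaches its limit (more messages, more deliveries, and faster delivery).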
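To make the doubling behavior concrete, here is a rough, self-contained sketch of a slow-start schedule. It is only an illustration of the idea described above, not Pub/Sub's actual implementation: the class name SlowStartSketch, the method deliverySchedule, the starting window of 1, and the maxWindow cap are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of slow start; not part of any Pub/Sub library.
class SlowStartSketch {

    // Returns how many messages are sent in each delivery round until
    // `total` messages have been delivered. The window starts at 1
    // (an assumption) and doubles after every round, capped at maxWindow.
    static List<Integer> deliverySchedule(int total, int maxWindow) {
        List<Integer> rounds = new ArrayList<>();
        int window = 1;
        int remaining = total;
        while (remaining > 0) {
            int sent = Math.min(window, remaining);
            rounds.add(sent);
            remaining -= sent;
            // Every delivery is assumed to succeed, so the window doubles.
            window = Math.min(window * 2, maxWindow);
        }
        return rounds;
    }

    public static void main(String[] args) {
        // 100 pending messages, cap of 1000 messages per round.
        System.out.println(deliverySchedule(100, 1000)); // [1, 2, 4, 8, 16, 32, 37]
    }
}
```

Under these assumptions, a backlog of 100 messages drains in 7 rounds because the window grows quickly, whereas a trickle of messages never gives the window a chance to grow.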

[1] https://cloud.google.com/pubsub/docs/pull#asynchronous-pull

High latency in GCP Pubsub at low messages/second
