Netty request timeout

Date: 2018-08-06 09:13:56

Tags: java apache-kafka netty kafka-producer-api

I am trying to write an HTTP service that takes data over HTTP and puts it into Kafka, using Netty. I need to handle 20K RPS on an m5.large EC2 instance, which seems feasible.

The code is simple:

Server.java

public class Server {
    public static void main(final String[] args) throws Exception {
        final EventLoopGroup bossGroup = new EpollEventLoopGroup();
        final EventLoopGroup workerGroup = new EpollEventLoopGroup();

        try {
            final ServerBootstrap bootstrap = new ServerBootstrap();

            bootstrap
                .group(bossGroup, workerGroup)
                .channel(EpollServerSocketChannel.class)
                .childHandler(new RequestChannelInitializer(createProducer()))
                .childOption(ChannelOption.SO_KEEPALIVE, true);
            bootstrap.bind(PORT).sync().channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }

    private static Producer<String, ByteBuffer> createProducer() {
        final Properties properties = new Properties();

        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, KAFKA_HOST);
        properties.put(ProducerConfig.CLIENT_ID_CONFIG, "KafkaBidRequestProducer");
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteBufferSerializer.class.getName());
        properties.put(ProducerConfig.RETRIES_CONFIG, 0);
        properties.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 10000);
        properties.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000);
        properties.put(ProducerConfig.SEND_BUFFER_CONFIG, 33554432);

        return new KafkaProducer<>(properties);
    }
}

RequestChannelInitializer.java

public class RequestChannelInitializer extends io.netty.channel.ChannelInitializer<SocketChannel> {
    private final Producer<String, ByteBuffer> producer;

    public RequestChannelInitializer(final Producer<String, ByteBuffer> producer) {
        this.producer = producer;
    }

    @Override
    public void initChannel(final SocketChannel ch) {
        ch.pipeline().addLast(new HttpServerCodec());
        ch.pipeline().addLast(new HttpObjectAggregator(1048576));
        ch.pipeline().addLast(new RequestHandler(producer));
    }
}

RequestHandler.java

public class RequestHandler extends SimpleChannelInboundHandler<FullHttpMessage> {
    private final Producer<String, ByteBuffer> producer;

    public RequestHandler(final Producer<String, ByteBuffer> producer) {
        this.producer = producer;
    }

    @Override
    public void channelReadComplete(final ChannelHandlerContext ctx) {
        ctx.flush();
    }

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, FullHttpMessage msg) {
        final DefaultFullHttpResponse response = new DefaultFullHttpResponse(HTTP_1_1, OK);
        final ProducerRecord<String, ByteBuffer> record = new ProducerRecord<>(
            "test",
            UUID.randomUUID().toString(),
            msg.content().nioBuffer()
        );

        producer.send(record);

        if (HttpUtil.isKeepAlive(msg)) {
            // Honor keep-alive: set the header, declare the body length, and
            // leave the connection open instead of closing after every request.
            response.headers().set(CONNECTION, HttpHeaderValues.KEEP_ALIVE);
            HttpUtil.setContentLength(response, response.content().readableBytes());
            ctx.write(response);
        } else {
            ctx.write(response).addListener(ChannelFutureListener.CLOSE);
        }
    }
}
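A side note on buffer lifetimes in the handler above: msg.content().nioBuffer() returns a view that shares memory with Netty's buffer, which SimpleChannelInboundHandler releases after channelRead0 returns. KafkaProducer serializes the value synchronously inside send(), so this works here, but if the payload were handed off to another thread, a deep copy would be needed. The view-versus-copy distinction can be shown with plain java.nio buffers (CopyDemo is a hypothetical name; no Netty or Kafka required):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CopyDemo {
    // Deep-copy a buffer so the consumer owns bytes independent of the original.
    static ByteBuffer deepCopy(ByteBuffer src) {
        ByteBuffer copy = ByteBuffer.allocate(src.remaining());
        copy.put(src.duplicate()); // duplicate() leaves src's position untouched
        copy.flip();
        return copy;
    }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap("payload".getBytes(StandardCharsets.UTF_8));
        ByteBuffer view = original.duplicate(); // shares memory, like nioBuffer()
        ByteBuffer copy = deepCopy(original);   // independent storage

        original.put(0, (byte) 'X');            // simulate the buffer being reused
        System.out.println((char) view.get(0)); // X - the view sees the change
        System.out.println((char) copy.get(0)); // p - the copy is unaffected
    }
}
```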

The code is taken from the official documentation. However, under load testing I sometimes get the exception Request 'Post BidRequest' failed: j.u.c.TimeoutException: Request timeout after 60000 ms.

As I understand it, this means a connection was established between my load-testing instance and the service instance, but it took more than 60 seconds to complete the request. What part of this simple program could block for that long?

I have already tuned the Kafka producer: reduced the timeouts. I know that send can block, so I increased the send buffer, but it did not help. I have also increased ulimits for the service user. I am running OpenJDK 1.8.0_171 with securerandom.source set to file:/dev/urandom, so the call to randomUUID should not block.
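The claim that randomUUID does not block can be sanity-checked directly. A rough microcheck (not a rigorous benchmark; the class name is hypothetical) generates a batch of UUIDs and times it:

```java
import java.util.UUID;

public class UuidTimingCheck {
    // Returns elapsed milliseconds for generating `count` random UUIDs.
    static long timeUuids(int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            UUID.randomUUID();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = timeUuids(100_000);
        // With securerandom.source=file:/dev/urandom this should complete in well
        // under a second on typical hardware; multi-second stalls would instead
        // point at entropy-pool blocking (i.e. /dev/random behavior).
        System.out.println("100k UUIDs in " + ms + " ms");
    }
}
```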

1 Answer:

Answer 0 (score: 1)

You are right, nothing in there should block. The call that sends to Kafka is asynchronous. I went through your code and, from what I can see, everything looks fine.

A few things I would check:

  • Make sure the security group definitions in AWS allow the Kafka server and this application to communicate with ZooKeeper. If this is a test/POC, you should just allow all traffic between all three instances/clusters. The 60-second timeout makes me suspect a network timeout, which could mean some service is unreachable.
  • Have you tried a simpler test that produces to Kafka without the Netty dependency? That might help narrow down the problem.
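Following the second suggestion, a standalone smoke test can isolate the producer from Netty: build a bare-bones config with short, fail-fast timeouts and produce a few records directly. A sketch of such a config (the broker address and class name are placeholders; the keys are standard kafka-clients property names):

```java
import java.util.Properties;

public class MinimalProducerConfig {
    // Bare-minimum producer config for a standalone smoke test.
    static Properties minimalProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteBufferSerializer");
        // Fail fast instead of waiting the defaults when the broker is unreachable,
        // so a connectivity problem shows up in seconds rather than as a long stall.
        props.put("max.block.ms", "5000");
        props.put("request.timeout.ms", "5000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = minimalProps("localhost:9092");
        System.out.println(props.getProperty("max.block.ms"));
        // Pass props to new KafkaProducer<>(props), send a few records with the
        // callback overload of send(), and watch for exceptions in the callback.
    }
}
```

If this minimal producer times out too, the problem is broker connectivity (security groups, advertised listeners), not the Netty service.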