librdkafka program exits without an error

Time: 2018-03-19 01:16:41

Tags: c++ multithreading c++11 apache-kafka c++14

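I have a Producer running on the main thread and a Consumer running on its own thread (std::thread). I have a simple program that sends a message with the Producer and then puts the main thread to sleep before trying to send another message.

Whenever my main thread goes to sleep, the program simply exits. There is no exception. The same thing happens when I try to properly stop and delete my consumer/producer. Clearly I am doing something wrong, but I cannot tell what, because I get no error from my program. The last log message I see is the one I print just before the main thread goes to sleep.

I have put try-catch blocks in main and inside my Consumer thread. I have also called std::set_terminate and added logging there. When my program exits, neither the try-catch blocks nor the terminate handler catch anything.

Any suggestions?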

Update #1 [Source]

As Sid S pointed out, I left out the obvious: the source.

main.cc

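#include <chrono>
#include <cstdlib>
#include <exception>
#include <future>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
// Plus the project header that declares KafkaMessenger and MessengerCode (not shown).

int main(int argc, char** argv) {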
  std::cout << "% Main started." << std::endl;

  std::set_terminate([](){
    std::cerr << "% Terminate occurred in main." << std::endl;
    abort();
  });

  try {
    using com::anya::core::networking::KafkaMessenger;
    using com::anya::core::common::MessengerCode;
    KafkaMessenger messenger;

    auto promise = std::promise<bool>();
    auto future = promise.get_future();
    messenger.Connect([&promise](MessengerCode code, std::string& message) {
      promise.set_value(true);
    });
    future.get();

    std::cout << "% Main connection successful." << std::endl;

    // Produce 5 messages 5 seconds apart.
    int number_of_messages_sent = 0;
    while (number_of_messages_sent < 5) {
      std::stringstream message;
      message << "message-" << number_of_messages_sent;

      auto message_send_promise = std::promise<bool>();
      auto message_send_future = message_send_promise.get_future();
      messenger.SendMessage(message.str(), [&message_send_promise](MessengerCode code) {
        std::cout << "% Main message sent" << std::endl;
        message_send_promise.set_value(true);
      });
      message_send_future.get();

      number_of_messages_sent++;
      std::cout << "% Main going to sleep for 5 seconds." << std::endl;
      std::this_thread::sleep_for(std::chrono::seconds(5));
    }

    // Disconnect from Kafka and cleanup.
    auto disconnect_promise = std::promise<bool>();
    auto disconnect_future = disconnect_promise.get_future();
    messenger.Disconnect([&disconnect_promise](MessengerCode code, std::string& message) {
      disconnect_promise.set_value(true);
    });
    disconnect_future.get();
    std::cout << "% Main disconnect complete." << std::endl;
  } catch (std::exception& exception) {
    std::cerr << "% Exception caught in main with error: " << exception.what() << std::endl;
    exit(1);
  }

  std::cout << "% Main exited." << std::endl;
  exit(0);
}

KafkaMessenger.cc [Consumer section]

void KafkaMessenger::Connect(std::function<void(MessengerCode , std::string&)> impl) {
  assert(!running_.load());
  running_.store(true);

  // For the sake of brevity I've removed a whole bunch of Kafka configuration setup from the sample code.

  RdKafka::ErrorCode consumer_response = consumer_->start(topic_for_consumer, 0, RdKafka::Topic::OFFSET_BEGINNING);

  if (consumer_response != RdKafka::ERR_NO_ERROR) {
    running_.store(false);
    delete consumer_;
    delete producer_;

    error = RdKafka::err2str(consumer_response);
    impl(MessengerCode::CONNECT_FAILED, error);
    return;  // Do not start the consumer thread after a failed start.
  }

  auto consumer_thread_started_promise = std::promise<bool>();
  auto consumer_thread_started_future = consumer_thread_started_promise.get_future();
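  // Capture the topic handle by value (assuming it is a local pointer):
  // a reference to a local would dangle once Connect() returns.
  consumer_thread_ = std::thread([this, topic_for_consumer, &consumer_thread_started_promise]() {
    try {
      std::cout << "% Consumer thread started." << std::endl;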
      consumer_thread_started_promise.set_value(true);

      while (running_.load()) {
        RdKafka::Message* message = consumer_->consume(topic_for_consumer, 0, 5000);

        switch (message->err()) {
          case RdKafka::ERR_NO_ERROR: {
            std::string message_string((char*) message->payload());
            std::cout << "% Consumer received message: " << message_string << std::endl;
            break;
          }
          default:
            std::cerr << "% Consumer consumption failed: " << message->errstr() << " error code=" << message->err() << std::endl;
            break;
        }
        // Free the message on every path, not only on success.
        delete message;
      }

      std::cout << "% Consumer shutting down." << std::endl;
      if (consumer_->stop(topic_for_consumer, 0) != RdKafka::ERR_NO_ERROR) {
        std::cerr << "% Consumer error while trying to stop." << std::endl;
      }
    } catch (std::exception& exception) {
      std::cerr << "% Caught exception in consumer thread: " << exception.what() << std::endl;
    }
  });

  consumer_thread_started_future.get();
  std::string message("Consumer connected");
  impl(MessengerCode::CONNECT_SUCCESS, message);
}

KafkaMessenger.cc [Producer section]

void KafkaMessenger::SendMessage(std::string message, std::function<void(MessengerCode)> impl) {
  assert(running_.load());
  std::cout << "% Producer sending message." << std::endl;

  RdKafka::ErrorCode producer_response = producer_->produce(
      producer_topic_,
      RdKafka::Topic::PARTITION_UA,
      RdKafka::Producer::RK_MSG_COPY,
      static_cast<void*>(&message), message.length(), nullptr, nullptr);

  switch (producer_response) {
    case RdKafka::ERR_NO_ERROR: {
      std::cout << "% Producer Successfully sent (" << message.length() << " bytes)" << std::endl;
      impl(MessengerCode::MESSAGE_SEND_SUCCESS);
      break;
    }
    case RdKafka::ERR__QUEUE_FULL: {
      std::cerr << "% Sending message failed: " << RdKafka::err2str(producer_response) << std::endl;
      impl(MessengerCode::MESSAGE_SEND_FAILED);
      break;
    }
    case RdKafka::ERR__UNKNOWN_PARTITION: {
      std::cerr << "% Sending message failed: " << RdKafka::err2str(producer_response) << std::endl;
      impl(MessengerCode::MESSAGE_SEND_FAILED);
      break;
    }
    case RdKafka::ERR__UNKNOWN_TOPIC: {
      std::cerr << "% Sending message failed: " << RdKafka::err2str(producer_response) << std::endl;
      impl(MessengerCode::MESSAGE_SEND_FAILED);
      break;
    }
    default: {
      std::cerr << "% Sending message failed: " << RdKafka::err2str(producer_response) << std::endl;
      impl(MessengerCode::MESSAGE_SEND_FAILED);
      break;
    }
  }
}

Output

When I run the main method, this is the output I see in the console.

% Main started.
% Consumer thread started.
% Main connection successful.
% Producer sending message.
% Producer Successfully sent (9 bytes)
% Main message sent
% Main going to sleep for 5 seconds.
% Consumer received message: message-

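On closer inspection, I do not think the sleep is the cause, because this still happens when I remove the sleep. As you can see in the last log line, the Consumer prints the message it received with the last character truncated. The payload should read message-0. So something, somewhere, is dying.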
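
A possible explanation for the truncated payload, offered as a guess rather than a confirmed diagnosis: SendMessage passes static_cast<void*>(&message), which is the address of the std::string object, not of its characters. With RK_MSG_COPY the library copies message.length() bytes starting at that address, so what goes on the wire depends on the string's internal layout. On an implementation with a small-string optimization that stores a length byte first, that could be one control byte followed by "message-", which would match the output. The sketch below uses a hypothetical CopyPayload stand-in (not a librdkafka function) to show which pointer should be passed instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a producer that copies `len` bytes from `payload`,
// the way a copying produce call would. It is NOT part of librdkafka.
std::vector<char> CopyPayload(const void* payload, std::size_t len) {
  assert(payload != nullptr || len == 0);
  const char* bytes = static_cast<const char*>(payload);
  return std::vector<char>(bytes, bytes + len);
}

// Returns the bytes that should go on the wire for a text message.
std::vector<char> EncodeMessage(const std::string& message) {
  // message.data() points at the characters themselves;
  // &message would point at the std::string object instead.
  return CopyPayload(message.data(), message.size());
}
```

With this, EncodeMessage("message-0") yields exactly the nine characters of the text. In the real call, the equivalent change would be passing const_cast<char*>(message.data()) as the payload, since produce takes a void*.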

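Update #2 [Stack trace]

I came across this old but very useful post about catching signals and printing out the stack. I implemented that solution, and now I can see more information about where things are crashing.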
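
For readers who want to reproduce this kind of trace, here is a minimal sketch of a crash handler. It assumes a platform that provides <execinfo.h> (glibc or macOS); CrashHandler and InstallCrashHandler are illustrative names and are not taken from the linked post. backtrace is not guaranteed to be async-signal-safe, so treat this as a debugging aid only. It also shows why try-catch and std::set_terminate stay silent: a segmentation fault is a signal, not a C++ exception.

```cpp
#include <cassert>
#include <csignal>
#include <execinfo.h>
#include <unistd.h>

// Writes the current call stack to stderr, then exits without running
// destructors or atexit handlers (the process state is already suspect).
extern "C" void CrashHandler(int /*sig*/) {
  void* frames[64];
  int count = backtrace(frames, 64);
  backtrace_symbols_fd(frames, count, STDERR_FILENO);
  _exit(1);
}

// Registers CrashHandler for SIGSEGV. Returns false if registration failed.
bool InstallCrashHandler() {
  return std::signal(SIGSEGV, CrashHandler) != SIG_ERR;
}
```

Calling InstallCrashHandler() at the top of main is enough for the handler to fire on signal 11.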

Error: signal 11:
0   main                                0x00000001012e4eec _ZN3com4anya4core10networking7handlerEi + 28
1   libsystem_platform.dylib            0x00007fff60511f5a _sigtramp + 26
2   ???                                 0x0000000000000000 0x0 + 0
3   main                                0x00000001012f2866 rd_kafka_poll_cb + 838
4   main                                0x0000000101315fee rd_kafka_q_serve + 590
5   main                                0x00000001012f5d46 rd_kafka_flush + 182
6   main                                0x00000001012e7f1a _ZN3com4anya4core10networking14KafkaMessenger10DisconnectENSt3__18functionIFvNS1_6common13MessengerCodeENS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEEEEE + 218
7   main                                0x00000001012dbc45 main + 3221
8   libdyld.dylib                       0x00007fff60290115 start + 1
9   ???                                 0x0000000000000001 0x0 + 1

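As part of my shutdown method I call producer_->flush(1000), which produces the stack trace above. If I remove that call, shutdown works fine. Clearly I have misconfigured something that causes this segfault when I try to flush.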

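Update #3 [Solution]

It turns out that my classes that handle Kafka event logging and delivery-report logging were scoped to a method. This is a problem because librdkafka takes these by reference, so once my main setup method exited and cleanup began, those objects were gone. I moved the loggers to class scope, and that fixed the crash.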
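
The lifetime rule behind this fix can be sketched without librdkafka. Every type below is a hypothetical stand-in (EventSink plays the role of a callback class, Handle the role of a client that keeps a non-owning pointer to it); none of them are the real RdKafka types. The point is only that the callback object is a data member, declared before the handle that uses it, so it is still alive when a late flush invokes it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback interface, standing in for an event/delivery logger.
class EventSink {
 public:
  virtual ~EventSink() = default;
  virtual void OnEvent(const std::string& text) = 0;
};

// Hypothetical client that stores a raw, non-owning pointer to the sink.
class Handle {
 public:
  explicit Handle(EventSink* sink) : sink_(sink) { assert(sink != nullptr); }
  // Invoked late, e.g. during shutdown, long after construction.
  void Flush() { sink_->OnEvent("flushed"); }

 private:
  EventSink* sink_;  // The sink must outlive this handle.
};

class RecordingSink : public EventSink {
 public:
  void OnEvent(const std::string& text) override { events.push_back(text); }
  std::vector<std::string> events;
};

// The sink is a member declared BEFORE the handle, so it is constructed first
// and destroyed last. A sink local to a setup method would already be gone.
class Messenger {
 public:
  Messenger() : handle_(&sink_) {}
  void Disconnect() { handle_.Flush(); }
  const std::vector<std::string>& events() const { return sink_.events; }

 private:
  RecordingSink sink_;
  Handle handle_;
};
```

Had sink_ been a local variable inside a setup method, Flush() would call through a dangling pointer, which is consistent with the null frame under rd_kafka_poll_cb in the trace.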

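1 Answer:

Answer 0: (score: 1)

A Kafka message payload is just binary data. Unless you send a string with a trailing nul byte, the payload will not contain one. That causes your std::string constructor to read into adjacent memory looking for a nul, possibly touching unmapped memory, which will crash your application or at least garble your terminal.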

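Use the message length together with the payload to construct a std::string that is limited to the actual number of bytes. It is still not safe to print, but it is a start: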

std::string message_string((char*) message->payload(), message->len());
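
As for the "still not safe to print" caveat, one option is to escape non-printable bytes before logging. The following is a minimal sketch; PrintablePayload is a hypothetical helper name, not a librdkafka function, and it assumes the default "C" locale for std::isprint.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Builds a log-safe string from exactly `len` bytes of `payload`.
// Printable characters are kept; everything else becomes \xNN.
std::string PrintablePayload(const void* payload, std::size_t len) {
  if (payload == nullptr) return std::string();
  const unsigned char* bytes = static_cast<const unsigned char*>(payload);
  std::string out;
  for (std::size_t i = 0; i < len; ++i) {
    if (std::isprint(bytes[i])) {
      out.push_back(static_cast<char>(bytes[i]));
    } else {
      char escaped[5];  // "\xNN" plus the terminating nul
      std::snprintf(escaped, sizeof(escaped), "\\x%02x",
                    static_cast<unsigned>(bytes[i]));
      out += escaped;
    }
  }
  return out;
}
```

In the consumer loop this would be called as PrintablePayload(message->payload(), message->len()), so the length always comes from the message rather than from a nul search.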