Boost socket goes belly-up on close()

Time: 2018-10-02 21:51:41

Tags: c++ sockets boost

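We have a C++ application that talks to a server. It sends the server two messages, and the server answers each one with a message of its own. We are using Boost, but the Boost socket (and with it the whole application) crashes when we try to close the socket.

This is the general idea of what we are doing:

  1. Encode the message (convert it to a string)
  2. Open the socket
  3. Send the message
  4. Check the number of bytes sent
  5. Read and check the response message
  6. Shut down and close the socket

Since we send two messages, we do this in a loop (with just two iterations, obviously).

We know exactly which line triggers the error, because everything works if we remove it. It is step 5. Unfortunately, that step is essential, and we have not found a workaround.

The code is as follows: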

bool ReallyImportantService::sendMessages( int messageNum ) {

    // ...some error-checking here...

    bool successCode = false;
    for( int i = 0; i < 2; ++i ) {

        successCode = false;

        unique_ptr<boost::asio::ip::tcp::socket> theSocket = connect();

        if( theSocket == nullptr ) {
            theLogger->error( "Could not create socket, could not send input messageNum to service" );
            return successCode;
        }

        string message = encodeMessage( messageNum );

        // send the message
        boost::system::error_code error;
        size_t bytesSent = boost::asio::write(*theSocket,
                                       boost::asio::buffer(message),
                                       boost::asio::transfer_all(), error);

        // inspect the result
        if( !messageNumSendSuccessful(message.length(), bytesSent) ) {
            return successCode;
        }

        // Get the response message
        string response;
        boost::system::error_code e;
        boost::asio::streambuf buffer;

        // this is step #5 above, the line that kills it. But it responds with no errors
        boost::asio::read_until(*theSocket, buffer, "\0", e);

        if( e.value() == boost::system::errc::success ) {
            istream str(&buffer);
            getline(str, response);

            // validate response
            successCode = messageAckIsValid( response, messageNum );
        }
        else {
            theLogger->error( "Got erroneous response from server when sending messageNum" );
        }

        // close it all up
        boost::system::error_code eShut;
        theSocket->shutdown(boost::asio::socket_base::shutdown_type::shutdown_both, eShut);
        // We never get an error code here, all clean

        try {
            boost::system::error_code ec;

            // This is where it all goes belly-up. It doesn't throw an exception, doesn't return an 
            // error-code. Stepping through, we can see the call stack shows a Segmentation fault, 
            // but we don't know what could be causing this.
            theSocket->close( ec );
        }
        catch(boost::system::system_error& se) {
            theLogger->error( "sendMessages() barfed on close! " + string(se.what()) );
        }
        catch( ... ) {
            theLogger->error( "sendMessages() barfed on close! " );
        }
    }
    return successCode;
}

string ReallyImportantService::encodeMessage( int messageNum ) {

    // Encode the message
    stringstream ss;
    ss << "^FINE=";
    ss << to_string(messageNum) << "\n";
    string message = ss.str();

    theLogger->info( message );

    return message;
}

unique_ptr<boost::asio::ip::tcp::socket> ReallyImportantService::connect() {
    // Addresses from configuration
    string address( server_ip );
    string port( server_port );

    // Resolve the IP address
    boost::asio::io_service ioService;
    boost::asio::ip::tcp::resolver resolver(ioService);
    boost::asio::ip::tcp::resolver::query query(address, port);
    boost::asio::ip::tcp::resolver::iterator ep_iterator = resolver.resolve(query);

    // create the socket
    unique_ptr<boost::asio::ip::tcp::socket> theSocket = make_unique<boost::asio::ip::tcp::socket>(ioService);

    // not sure if this is necessary, but couldn't hurt; we do reuse the IP address the second time around
    boost::system::error_code ec;
    theSocket->set_option(boost::asio::socket_base::reuse_address(true), ec);

    // Connect
    try {

        boost::asio::connect(*theSocket, ep_iterator);

    } catch(const boost::system::system_error &e){
        theSocket = nullptr;
        theLogger->error( "Exception while attempting to create socket: " + string(e.what()) );
    } catch(const exception &e){
        theSocket = nullptr;
        theLogger->error( "Exception while attempting to create socket: " + string(e.what()) );
    }

    return theSocket;
}

This is the call stack we get when the error occurs:

(Suspended : Signal : SIGSEGV:Segmentation fault)   
    pthread_mutex_lock() at 0x7ffff7bc8c30  
    boost::asio::detail::posix_mutex::lock() at posix_mutex.hpp:52 0x969072 
    boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock() at scoped_lock.hpp:36 0x980b66    
    boost::asio::detail::epoll_reactor::free_descriptor_state() at epoll_reactor.ipp:517 0x96c6fa   
    boost::asio::detail::epoll_reactor::deregister_descriptor() at epoll_reactor.ipp:338 0x96bccc   
    boost::asio::detail::reactive_socket_service_base::close() at reactive_socket_service_base.ipp:103 0xb920aa 
    boost::asio::stream_socket_service<boost::asio::ip::tcp>::close() at stream_socket_service.hpp:151 0xb975e0 
    boost::asio::basic_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >::close() at basic_socket.hpp:339 0xb94f0d    
    ReallyImportantService::sendMessages() at ReallyImportantService.cc:116 0xb8ce19    
    <...more frames...> 

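We created a minimal implementation that only:

  1. Creates the socket
  2. Shuts down the socket
  3. Closes the socket

It runs perfectly. We put it in a loop and it ran for dozens of iterations without any problem.

We are using Eclipse CDT and compiling with gcc.

Any ideas?

1 answer:

Answer 0: (score: 1)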

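You have broken a cardinal rule.

An io_service must outlive every object created on it.

Your connect() function creates an io_service, creates a socket on it, and returns that socket (wrapped in a unique_ptr). The io_service is then destroyed when connect() returns.

From that point on all bets are off, because the socket will use the socket service object that belonged to the io_service you just destroyed. That socket service is now nothing but memory holding undefined values. You were unlucky that the program got as far as it did before the segfault.

In general you need one io_service per application. Every object that needs it should carry a reference to it.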
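The lifetime rule can be shown without Boost. The sketch below is only an analogy, not Asio code: `Context` stands in for the io_service, `Handle` for a socket, and `Service` for the class that owns both. All three names are hypothetical. It assumes, as the call stack above suggests, that closing a socket reaches back into state owned by the io_service.

```cpp
#include <cassert>

// Hypothetical stand-in for io_service: owns bookkeeping that handles rely on.
class Context {
public:
    void register_handle() { ++live_; }
    void deregister_handle() {
        assert(live_ > 0);  // deregistering more than was registered is a bug
        --live_;
    }
    int live_handles() const { return live_; }

private:
    int live_ = 0;
};

// Hypothetical stand-in for a socket: it borrows the Context and touches it
// again on close, so the Context must still be alive at that point.
class Handle {
public:
    explicit Handle(Context& ctx) : ctx_(&ctx) { ctx_->register_handle(); }
    ~Handle() { close(); }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    void close() {
        if (open_) {
            ctx_->deregister_handle();
            open_ = false;
        }
    }
    bool is_open() const { return open_; }

private:
    Context* ctx_;
    bool open_ = true;
};

// The owner keeps one long-lived Context and lends it to short-lived handles.
class Service {
public:
    int run_twice() {
        int closed = 0;
        for (int i = 0; i < 2; ++i) {
            Handle h(ctx_);  // created on the long-lived context
            h.close();       // safe: ctx_ outlives h
            ++closed;
        }
        return closed;
    }
    int live_handles() const { return ctx_.live_handles(); }

private:
    Context ctx_;  // lives as long as the Service itself
};
```

Had `run_twice()` obtained each `Handle` from a helper that built a local `Context` and returned only the handle, `close()` would dereference a destroyed object, which is the situation described above.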

Your connect function then becomes:

bool connect(boost::asio::ip::tcp::socket& theSocket) {
    // Addresses from configuration
    string address( server_ip );
    string port( server_port );

    // Resolve the IP address
    boost::asio::ip::tcp::resolver resolver(theSocket.get_io_service());
    boost::asio::ip::tcp::resolver::query query(address, port);
    boost::asio::ip::tcp::resolver::iterator ep_iterator = resolver.resolve(query);

    // not sure if this is necessary, but couldn't hurt; we do reuse the IP address the second time around
    boost::system::error_code ec;
    theSocket.set_option(boost::asio::socket_base::reuse_address(true), ec);

    // Connect
    try {

        boost::asio::connect(theSocket, ep_iterator);

    } catch(const boost::system::system_error &e){
        // theSocket is a reference owned by the caller, so there is nothing to reset here
        theLogger->error( "Exception while attempting to create socket: " + string(e.what()) );
        return false;
    } catch(const exception &e){
        theLogger->error( "Exception while attempting to create socket: " + string(e.what()) );
        return false;
    }

    return true;
}

bool sendMessages(boost::asio::io_service& ios, int messageNum)
{
    boost::asio::ip::tcp::socket theSocket(ios);
    auto ok = connect(theSocket);

    // ... carry on ...

}

  • Prefer to hold sockets and similar objects by reference wherever you can. Wrapping them in a unique_ptr is a confusing layer of indirection.

  • As of C++11 and recent versions of Boost, asio sockets are movable. You can return them by value rather than passing in a reference as I have done.

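  • I notice that you mix exception-based and non-exception error handling in your code. You may want to stick to one or the other (in my view exception-based error handling is cleaner, but this is not a universally held view).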
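
To make the last point concrete, here is a minimal sketch of the two error-handling styles side by side. It uses `std::error_code` from the standard library rather than `boost::system::error_code` so that it stands alone. `send_bytes` and its `link_up` flag are hypothetical and only simulate a failure. Many Asio operations come as a similar pair: one overload that takes an error_code and one that throws.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>

// Non-throwing style: failure is reported through the error_code argument.
// link_up is a hypothetical flag that simulates whether the peer is reachable.
std::size_t send_bytes(const std::string& data, bool link_up, std::error_code& ec) {
    if (!link_up) {
        ec = std::make_error_code(std::errc::not_connected);
        return 0;
    }
    ec.clear();
    return data.size();
}

// Throwing style: the same operation, with failure turned into an exception.
std::size_t send_bytes(const std::string& data, bool link_up) {
    std::error_code ec;
    const std::size_t sent = send_bytes(data, link_up, ec);
    if (ec) {
        throw std::system_error(ec, "send_bytes");
    }
    return sent;
}
```

Choosing one overload family throughout a function means every failure path looks the same: either each call is followed by a check of `ec`, or the whole sequence sits inside a single try block.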