What is causing OpenSSL to segfault?

Asked: 2014-03-03 17:13:35

Tags: c++ c linux ssl openssl

(gdb) bt
#0  0x040010c2 in ?? () from /lib/ld-linux.so.2
#1  0x06a14a0b in write () at ../sysdeps/unix/syscall-template.S:82
#2  0x04154ae9 in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#3  0x041518e4 in BIO_write () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#4  0x040781f1 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#5  0x040785ff in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#6  0x04078855 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#7  0x04075e28 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#8  0x0408d709 in SSL_write () from /lib/i386-linux-gnu/libssl.so.1.0.0
#9  0x0409c451 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#10 0x041518e4 in BIO_write () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#11 0x0814b10f in SSL_Connection_send (connection=0x9ffbbd0) 
...

(gdb) print *connection->bio
$1 = {method = 0x40ac800, callback = 0, cb_arg = 0x0, init = 1, shutdown = 1, flags = 0, retry_reason = 0, num = 0, ptr = 0xa27e768, next_bio = 0x7b84ad0, prev_bio = 0x0, references = 1, num_read = 904, 
  num_write = 2870, ex_data = {sk = 0x0, dummy = 774321733}}

(gdb) print *connection->ssl
$2 = {version = 769, type = 4096, method = 0x40aacc0, rbio = 0x7b84ad0, wbio = 0x7b84ad0, bbio = 0x0, rwstate = 2, in_handshake = 0, handshake_func = 0x40738f0, server = 0, new_session = 0, quiet_shutdown = 0, 
  shutdown = 0, state = 3, rstate = 240, init_buf = 0x0, init_msg = 0xf5838e4, init_num = 0, init_off = 0, 
  packet = 0xf848ebb "\027\003\001\001\330HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: *\r\nCache-Control: private, no-cache, no-store, must-revalidate\r\nContent-Type: text/javascript; charset=UTF-8\r\nETag: \"cacca674ed49d64124f812372ad59561"..., packet_length = 0, s2 = 0x0, s3 = 0xf3ea410, d1 = 0x0, read_ahead = 0, msg_callback = 0, msg_callback_arg = 0x0, hit = 0, param = 0x99f82b8, cipher_list = 0x0, 
  cipher_list_by_id = 0x0, mac_flags = 0, enc_read_ctx = 0xace5438, read_hash = 0xabce1c8, expand = 0x0, enc_write_ctx = 0x9794468, write_hash = 0xc057018, compress = 0x0, cert = 0xe1f70d8, sid_ctx_length = 0, 
  sid_ctx = '\000' , session = 0x7d54760, generate_session_id = 0, verify_mode = 0, verify_callback = 0, info_callback = 0, error = 0, error_code = 0, psk_client_callback = 0, 
  psk_server_callback = 0, ctx = 0xae8ce30, debug = 0, verify_result = 20, ex_data = {sk = 0x0, dummy = 0}, client_CA = 0x0, references = 1, options = 4, mode = 4, max_cert_list = 102400, first_packet = 0, 
  client_version = 769, max_send_fragment = 16384, tlsext_debug_cb = 0, tlsext_debug_arg = 0x0, tlsext_hostname = 0x0, servername_done = 0, tlsext_status_type = -1, tlsext_status_expected = 0, 
  tlsext_ocsp_ids = 0x0, tlsext_ocsp_exts = 0x0, tlsext_ocsp_resp = 0x0, tlsext_ocsp_resplen = -1, tlsext_ticket_expected = 1, tlsext_ecpointformatlist_length = 3, tlsext_ecpointformatlist = 0xac40bc8 "", 
  tlsext_ellipticcurvelist_length = 50, tlsext_ellipticcurvelist = 0x7b20878 "", tlsext_opaque_prf_input = 0x0, tlsext_opaque_prf_input_len = 0, tlsext_session_ticket = 0x0, tls_session_ticket_ext_cb = 0, 
  tls_session_ticket_ext_cb_arg = 0x0, tls_session_secret_cb = 0, tls_session_secret_cb_arg = 0x0, initial_ctx = 0xae8ce30, next_proto_negotiated = 0x982fd50 "Groups.History", 
  next_proto_negotiated_len = 111 'o', srtp_profiles = 0x7373656d, srtp_profile = 0x2e656761, tlsext_heartbeat = 1953720648, tlsext_hb_pending = 0, tlsext_hb_seq = 424, renegotiate = 1232, srp_ctx = {
    SRP_cb_arg = 0x0, TLS_ext_srp_username_callback = 0x6f697463, SRP_verify_param_callback = 0x64695f6e, SRP_give_srp_client_pwd_callback = 0x4c4f202c, login = 0x72742e44 "", N = 0x498, g = 0x0, s = 0xa, 
    B = 0xfc0301, A = 0xfc03, a = 0x0, b = 0x0, v = 0x0, info = 0x0, strength = 0, srp_Mask = 0}}

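I'm trying to figure out why this segfaults. The crash happens at random (only on Linux) when I make requests to a server. Could it be that the remote end closed the connection (connection->bio->shutdown == 1)? Or is it a memory error (next_proto_negotiated = 0x982fd50 "Groups.History"... that doesn't look like a protocol, although valgrind doesn't report any memory errors)?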

3 answers:

Answer 0 (score: 4)

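Are you doing any multithreaded programming and accessing the socket from different threads at the same time? If so, make sure you are not writing to it on one thread and closing it on another. I made this mistake before and saw crashes just like this one. It took some annoying re-architecting to get all of the SSL calls onto a single thread.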
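For illustration only, here is a minimal sketch of one way to rule out the write-versus-close race described above. It uses a mutex rather than the single-thread redesign, which is a different technique from the one I ended up with. Everything named here (`locked_conn`, `fake_write`, `fake_close`) is hypothetical; the callbacks stand in for `SSL_write` and the `SSL_shutdown`/`SSL_free` teardown so the sketch compiles without the OpenSSL headers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: every operation on the TLS handle takes one mutex. */
typedef int  (*write_fn)(void *handle, const void *buf, int len);
typedef void (*close_fn)(void *handle);

typedef struct {
    pthread_mutex_t lock;
    void    *handle;   /* would be an SSL* in real code; NULL once closed */
    write_fn do_write; /* stand-in for SSL_write */
    close_fn do_close; /* stand-in for SSL_shutdown + SSL_free */
} locked_conn;

static void locked_conn_init(locked_conn *c, void *handle,
                             write_fn w, close_fn cl)
{
    assert(c != NULL && w != NULL && cl != NULL);
    pthread_mutex_init(&c->lock, NULL);
    c->handle   = handle;
    c->do_write = w;
    c->do_close = cl;
}

/* Returns the callback's result, or -1 if the connection is already closed. */
static int locked_conn_write(locked_conn *c, const void *buf, int len)
{
    int rc = -1;
    pthread_mutex_lock(&c->lock);
    if (c->handle != NULL)
        rc = c->do_write(c->handle, buf, len);
    pthread_mutex_unlock(&c->lock);
    return rc;
}

/* Safe to call more than once and from any thread; tears down only once. */
static void locked_conn_close(locked_conn *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->handle != NULL) {
        c->do_close(c->handle);
        c->handle = NULL; /* later writes see NULL and return -1 */
    }
    pthread_mutex_unlock(&c->lock);
}

/* Demo stand-ins so the sketch can be exercised without a real connection. */
static int fake_write(void *handle, const void *buf, int len)
{
    (void)handle; (void)buf;
    return len; /* pretend every byte was written */
}

static void fake_close(void *handle)
{
    *(int *)handle = 1; /* record that teardown ran */
}
```

With a wrapper like this, a write that loses the race against a close gets -1 back instead of touching freed memory. Reads would need to go through the same lock, since an SSL object is not safe to use from two threads at once.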

Answer 1 (score: 0)

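If you are running into problems because of multithreading, a pause of a second or two between the two calls to SSL_shutdown() may help. The article below does this even in a single-threaded program.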

http://pic.dhe.ibm.com/infocenter/tpfhelp/current/index.jsp?topic=%2Fcom.ibm.ztpf-ztpfdf.doc_put.cur%2Fgtps5%2Fs5sple2.html

Answer 2 (score: 0)

I have seen openssl segfault on Linux when the server neglects to call SSL_CTX_set_session_id_context while setting up the TLS context.

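A good indication that this is the cause of the crash is a newer TLS client (for example Windows Server 2012) attempting to reconnect with a TLS ticket. What you will see is every other connection attempt causing a segfault, because after one good connection the client comes back with the TLS ticket on the second connection.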

Even if you are not using TLS tickets or session caching, try adding a call to SSL_CTX_set_session_id_context and see whether that resolves the problem.