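I have a kernel module that splits incoming RTP packets and merges outgoing RTP packets. It crashes about once every 2 to 3 days. It would be very helpful if I could find the exact line in the module where the crash occurs. The crash dump is below. Is it possible to find the exact line of code from the crash dump?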
PID: 1256 TASK: ffff88020fc71700 CPU: 0 COMMAND: "rtpproxy"
#0 [ffff880212faf2f0] machine_kexec at ffffffff8103bb7a
#1 [ffff880212faf360] crash_kexec at ffffffff810bb968
#2 [ffff880212faf430] oops_end at ffffffff8169fad8
#3 [ffff880212faf460] die at ffffffff81017808
#4 [ffff880212faf490] do_general_protection at ffffffff8169f5d2
#5 [ffff880212faf4c0] general_protection at ffffffff8169eef5
[exception RIP: pkt_queue+388]
RIP: ffffffffa00f3fa0 RSP: ffff880212faf578 RFLAGS: 00010292
RAX: ffff8802110ae400 RBX: ffff880213a53f38 RCX: 00015d910000a20f
RDX: 497d74565cede60c RSI: 000000006df1ed57 RDI: 00000000e46e0cfc
RBP: ffff880212faf728 R8: ffff880211a8b000 R9: ffff880212fafa60
R10: ffff880212fafbc8 R11: 0000000000000293 R12: 00000000134ab2b4
R13: 000000008386615c R14: 00000000000000e3 R15: 00000000000000e3
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffff880212faf730] obsf_tg at ffffffffa00f34a0 [xt_OBSF]
#7 [ffff880212faf890] ipt_do_table at ffffffffa00e41a5 [ip_tables]
#8 [ffff880212faf970] ipt_mangle_out at ffffffffa00dd129 [iptable_mangle]
#9 [ffff880212faf9c0] iptable_mangle_hook at ffffffffa00dd1eb [iptable_mangle]
#10 [ffff880212faf9d0] nf_iterate at ffffffff815aded5
#11 [ffff880212fafa20] nf_hook_slow at ffffffff815adf85
#12 [ffff880212fafaa0] __ip_local_out at ffffffff815babb2
#13 [ffff880212fafac0] ip_local_out at ffffffff815babd6
#14 [ffff880212fafae0] ip_send_skb at ffffffff815bbefb
#15 [ffff880212fafb00] udp_send_skb at ffffffff815df1d1
#16 [ffff880212fafb50] udp_sendmsg at ffffffff815e0286
#17 [ffff880212fafc90] inet_sendmsg at ffffffff815eabc4
#18 [ffff880212fafcd0] sock_sendmsg at ffffffff8156a437
#19 [ffff880212fafe50] sys_sendto at ffffffff8156d91d
#20 [ffff880212faff80] system_call_fastpath at ffffffff816a7029
RIP: 00007f17363b83a3 RSP: 00007ffff2965f90 RFLAGS: 00010213
RAX: 000000000000002c RBX: ffffffff816a7029 RCX: 00007ffff29ff99b
RDX: 0000000000000020 RSI: 00007f1737da4378 RDI: 0000000000000006
RBP: 0000000000000001 R8: 00007f1737da67a0 R9: 0000000000000010
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f1737da4378
R13: 0000000000000001 R14: 00007f1737da42a0 R15: 0000000000000000
ORIG_RAX: 000000000000002c CS: 0033 SS: 002b
[157707.736203] general protection fault: 0000 [#1] SMP
[157707.736955] CPU 0
[157707.736973] Modules linked in:
[157707.737654] arc4 xt_tcpudp xt_OBSF(O) iptable_mangle ip_tables x_tables ghash_clmulni_intel aesni_intel cryptd aes_x86_64 joydev hid_generic microcode ext2 usbhid psmouse hid serio_raw i2c_piix4 virtio_balloon lp parport mac_hid floppy
[157707.740018]
[157707.740102] Pid: 1256, comm: rtpproxy Tainted: G O 3.5.0-23-generic #35~precise1-Ubuntu Bochs Bochs
[157707.740102] RIP: 0010:[<ffffffffa00f3fa0>] [<ffffffffa00f3fa0>] pkt_queue+0x184/0x48a [xt_OBSF]
[157707.740102] RSP: 0018:ffff880212faf578 EFLAGS: 00010292
[157707.740102] RAX: ffff8802110ae400 RBX: ffff880213a53f38 RCX: 00015d910000a20f
[157707.740102] RDX: 497d74565cede60c RSI: 000000006df1ed57 RDI: 00000000e46e0cfc
[157707.740102] RBP: ffff880212faf728 R08: ffff880211a8b000 R09: ffff880212fafa60
[157707.740102] R10: ffff880212fafbc8 R11: 0000000000000293 R12: 00000000134ab2b4
[157707.740102] R13: 000000008386615c R14: 00000000000000e3 R15: 00000000000000e3
[157707.740102] FS: 00007f1736ad9700(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
[157707.740102] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[157707.740102] CR2: 00007fd8a39f8000 CR3: 0000000211ad7000 CR4: 00000000000407f0
[157707.740102] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[157707.740102] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[157707.740102] Process rtpproxy (pid: 1256, threadinfo ffff880212fae000, task ffff88020fc71700)
[157707.740102] Stack:
[157707.740102] ffff880212faf5a8 0000000000015d91 134ab2b400000008 000008f58386615c
[157707.740102] 00015d910000a20f a080527800000014 3a78560000d1fa00 564812de1a006045
[157707.740102] ffff880212faf618 ffffffff81872e20 0000000000000000 ffff880210ca9000
[157707.740102] Call Trace:
[157707.740102] [<ffffffff8169e7de>] ? _raw_spin_lock+0xe/0x20
[157707.740102] [<ffffffff815a0958>] ? sch_direct_xmit+0x88/0x1c0
[157707.740102] [<ffffffff81090833>] ? update_cpu_power+0x63/0x100
[157707.740102] [<ffffffff810909c3>] ? update_group_power+0xf3/0x100
[157707.740102] [<ffffffff81090db2>] ? update_sd_lb_stats+0x3e2/0x5f0
[157707.740102] [<ffffffffa00f34a0>] obsf_tg+0x9c0/0x133c [xt_OBSF]
[157707.740102] [<ffffffff81090ff9>] ? find_busiest_group+0x39/0x4a0
[157707.740102] [<ffffffff81091541>] ? load_balance+0xe1/0x4a0
[157707.740102] [<ffffffffa00e41a5>] ipt_do_table+0x315/0x450 [ip_tables]
[157707.740102] [<ffffffffa00dd129>] ipt_mangle_out+0x99/0x100 [iptable_mangle]
[157707.740102] [<ffffffffa00dd1eb>] iptable_mangle_hook+0x5b/0x60 [iptable_mangle]
[157707.740102] [<ffffffff815aded5>] nf_iterate+0x85/0xc0
[157707.740102] [<ffffffff815b8e50>] ? ip_forward_options+0x200/0x200
[157707.740102] [<ffffffff815adf85>] nf_hook_slow+0x75/0x150
[157707.740102] [<ffffffff815b8e50>] ? ip_forward_options+0x200/0x200
[157707.740102] [<ffffffff815babb2>] __ip_local_out+0xa2/0xb0
[157707.740102] [<ffffffff815babd6>] ip_local_out+0x16/0x30
[157707.740102] [<ffffffff815bbefb>] ip_send_skb+0x1b/0x50
[157707.740102] [<ffffffff815df1d1>] udp_send_skb+0x111/0x2a0
[157707.740102] [<ffffffff815b9070>] ? ip_setup_cork+0x150/0x150
[157707.740102] [<ffffffff815e0286>] udp_sendmsg+0x316/0x960
[157707.740102] [<ffffffff815eabc4>] inet_sendmsg+0x64/0xb0
[157707.740102] [<ffffffff812f31b7>] ? apparmor_socket_sendmsg+0x17/0x20
[157707.740102] [<ffffffff8156a437>] sock_sendmsg+0x117/0x130
[157707.740102] [<ffffffff8119a510>] ? __pollwait+0xf0/0xf0
[157707.740102] [<ffffffff8119a510>] ? __pollwait+0xf0/0xf0
[157707.740102] [<ffffffff8119a510>] ? __pollwait+0xf0/0xf0
[157707.740102] [<ffffffff8156b58d>] ? move_addr_to_user+0xbd/0xd0
[157707.740102] [<ffffffff8156ce7a>] ? move_addr_to_kernel+0x5a/0xa0
[157707.740102] [<ffffffff8156d91d>] sys_sendto+0x13d/0x190
[157707.740102] [<ffffffff8103fcc9>] ? kvm_clock_read+0x19/0x20
[157707.740102] [<ffffffff8103fcd9>] ? kvm_clock_get_cycles+0x9/0x10
[157707.740102] [<ffffffff810a3bd7>] ? getnstimeofday+0x57/0xe0
[157707.740102] [<ffffffff810a3cca>] ? do_gettimeofday+0x1a/0x50
[157707.740102] [<ffffffff816a7029>] system_call_fastpath+0x16/0x1b
[157707.740102] Code: f7 f1 48 8b 8d 70 fe ff ff 4c 63 f2 41 89 d7 49 69 c6 68 01 00 00 48 01 c3 48 8b 83 58 01 00 00 48 2d 58 01 00 00 48 89 c2 eb 20 <44> 39 62 04 0f 85 c0 02 00 00 44 39 6a 08 0f 85 b6 02 00 00 48
[157707.740102] RIP [<ffffffffa00f3fa0>] pkt_queue+0x184/0x48a [xt_OBSF]
[157707.740102] RSP <ffff880212faf578>
Answer 0 (score: 23)
[157707.736203] general protection fault: 0000 [#1] SMP
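This says you did something bad with memory, most likely dereferencing an invalid pointer. A plain NULL dereference is normally reported as "unable to handle kernel NULL pointer dereference" rather than as a general protection fault; on x86-64 a general protection fault on a memory access usually means the address was non-canonical, and RDX in the dump (497d74565cede60c) looks like exactly that kind of garbage pointer.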
[157707.740102] RIP: 0010:[<ffffffffa00f3fa0>] [<ffffffffa00f3fa0>] pkt_queue+0x184/0x48a
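This line reports the instruction pointer at the moment your module crashed: it died inside the function pkt_queue, at offset 0x184 from the start of that function (which is 0x48a bytes long). Incidentally, the first crash dump shows the same value: 388 in decimal is 0x184.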
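As a small sanity check of that reading, the sketch below splits the RIP line from the oops into symbol, offset and function size using plain POSIX shell parameter expansion, and confirms that 0x184 is 388 in decimal. The variable names are only illustrative.

```shell
#!/bin/sh
# The RIP line copied from the oops in the question.
line='RIP: 0010:[<ffffffffa00f3fa0>] [<ffffffffa00f3fa0>] pkt_queue+0x184/0x48a [xt_OBSF]'

# Drop everything up to the last "] ", leaving "pkt_queue+0x184/0x48a [xt_OBSF]".
rest=${line##*'] '}

sym=${rest%%+*}      # text before the first '+'  -> function name
tmp=${rest#*+}       # text after the first '+'   -> "0x184/0x48a [xt_OBSF]"
off=${tmp%%/*}       # text before the '/'        -> offset into the function
tmp=${tmp#*/}        # text after the '/'         -> "0x48a [xt_OBSF]"
size=${tmp%% *}      # text before the space      -> total size of the function

# Shell arithmetic understands the 0x prefix, so this converts hex to decimal.
echo "symbol=$sym offset=$off (decimal $(( $off ))) size=$size"
# prints: symbol=pkt_queue offset=0x184 (decimal 388) size=0x48a
```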
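Now you can use objdump to dump the assembly plus debug information for your code, add 0x184 to the address of the function pkt_queue, and arrive at the offending instruction.
Suppose (unrealistically) that pkt_queue appears at address 0x01 in the objdump output. You would then look at address 0x184 + 0x01 = 0x185 in the disassembly to see what is going on.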
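The address arithmetic can be scripted. This is a minimal sketch: fault_addr is a hypothetical helper name, and the commented usage assumes your object file is called your_object_file.o and that nm lists pkt_queue in it.

```shell
#!/bin/sh
# Hypothetical helper: add a function's start address (hex digits without
# the 0x prefix, as printed by nm or objdump) to the offset from the oops.
fault_addr() {
    printf '0x%x\n' $(( 0x$1 + $2 ))
}

# With the made-up start address of 0x01 from the text above:
fault_addr 1 0x184          # prints 0x185

# Assumed usage against a real object file (the file name is a placeholder):
#   start=$(nm your_object_file.o | awk '$3 == "pkt_queue" { print $1 }')
#   addr=$(fault_addr "$start" 0x184)
#   objdump -dS --start-address="$addr" --stop-address=$(( $addr + 0x20 )) your_object_file.o
```

Limiting objdump to a small window around the computed address keeps the output short enough to read the faulting instruction and the source lines next to it.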
objdump lets you see the source interleaved with the assembly:
objdump -S your_object_file.o
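This lists not only the assembly but also the corresponding source code, provided the object was compiled with debug symbols.
Oh, and for future reference: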
https://opensourceforu.com/2011/01/understanding-a-kernel-oops/
Answer 1 (score: 0)
You can also use:
eu-addr2line -f -e object_file.o pkt_queue+0x184
Answer 2 (score: 0)
There is also the script scripts/decode_stacktrace.sh in the kernel source tree. Enable CONFIG_DEBUG_INFO, then run the script:
./scripts/decode_stacktrace.sh /path/to/vmlinux /path/to/kernel/tree /path/to/modules/dir < dmesg.log
For example, starting from the root of the kernel source tree:
make O=~/kbuild/x86/ -j9
cd ~/kbuild/x86/
make INSTALL_MOD_PATH=~/modpath modules_install
cd -
./scripts/decode_stacktrace.sh ~/kbuild/x86/vmlinux . ~/modpath < crash.log
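Since the script can only resolve line numbers when debug info is present, it may be worth checking the build configuration first. The helper below is a hypothetical sketch (has_debug_info is not part of the kernel tree); it only greps a .config file for the option named above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel .config file
# enables CONFIG_DEBUG_INFO (matches exactly "CONFIG_DEBUG_INFO=y").
has_debug_info() {
    grep -q '^CONFIG_DEBUG_INFO=y' "$1"
}

# Assumed usage with the out-of-tree build directory from the example:
#   has_debug_info ~/kbuild/x86/.config || echo "rebuild with CONFIG_DEBUG_INFO=y"
```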