如何解释内核恐慌?

时间:2011-02-28 07:22:59

标签: c linux debugging linux-kernel

我是linux内核的新手,几乎无法理解如何调试内核恐慌。我在下面有这个错误,我不知道在C代码中我应该从哪里开始检查。我想也许我可以回应正在调用的函数,所以我可以检查这个空指针取消引用的特定函数的位置/。我应该使用什么打印功能?你如何解释下面的错误信息?

Unable to handle kernel NULL pointer dereference at virtual address 0000000d
pgd = c7bdc000
[0000000d] *pgd=4785f031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT
Modules linked in: bcm5892_secdom_fw(P) bcm5892_lcd snd_bcm5892 msr bcm5892_sci bcm589x_ohci_p12 bcm5892_skeypad hx_decoder(P) pinnacle hx_memalloc(P) bcm_udc_dwc scsi_mod g_serial sd_mod usb_storage
CPU: 0    Tainted: P           (2.6.27.39-WR3.0.2ax_standard #1)
PC is at __kmalloc+0x70/0xdc
LR is at __kmalloc+0x48/0xdc
pc : [c0098cc8]    lr : [c0098ca0]    psr: 20000093
sp : c7a9fd50  ip : c03a4378  fp : c7a9fd7c
r10: bf0708b4  r9 : c7a9e000  r8 : 00000040
r7 : bf06d03c  r6 : 00000020  r5 : a0000093  r4 : 0000000d
r3 : 00000000  r2 : 00000094  r1 : 00000020  r0 : c03a4378
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 00c5387d  Table: 47bdc008  DAC: 00000015
Process sh (pid: 1088, stack limit = 0xc7a9e260)
Stack: (0xc7a9fd50 to 0xc7aa0000)
fd40:                                     c7a6a1d0 00000020 c7a9fd7c c7ba8fc0
fd60: 00000040 c7a6a1d0 00000020 c71598c0 c7a9fd9c c7a9fd80 bf06d03c c0098c64
fd80: c71598c0 00000003 c7a6a1d0 bf06c83c c7a9fdbc c7a9fda0 bf06d098 bf06d008
fda0: c7159880 00000000 c7a6a2d8 c7159898 c7a9fde4 c7a9fdc0 bf06d130 bf06d078
fdc0: c79ca000 c7159880 00000000 00000000 c7afbc00 c7a9e000 c7a9fe0c c7a9fde8
fde0: bf06d4b4 bf06d0f0 00000000 c79fd280 00000000 0f700000 c7a9e000 00000241
fe00: c7a9fe3c c7a9fe10 c01c37b4 bf06d300 00000000 c7afbc00 00000000 00000000
fe20: c79cba84 c7463c78 c79fd280 c7473b00 c7a9fe6c c7a9fe40 c00a184c c01c35e4
fe40: 00000000 c7bb0005 c7a9fe64 c79fd280 c7463c78 00000000 c00a1640 c785e380
fe60: c7a9fe94 c7a9fe70 c009c438 c00a164c c79fd280 c7a9fed8 c7a9fed8 00000003
fe80: 00000242 00000000 c7a9feb4 c7a9fe98 c009c614 c009c2a4 00000000 c7a9fed8
fea0: c7a9fed8 00000000 c7a9ff64 c7a9feb8 c00aa6bc c009c5e8 00000242 000001b6
fec0: 000001b6 00000241 00000022 00000000 00000000 c7a9fee0 c785e380 c7473b00
fee0: d8666b0d 00000006 c7bb0005 00000300 00000000 00000000 00000001 40002000
ff00: c7a9ff70 c79b10a0 c79b10a0 00005402 00000003 c78d69c0 ffffff9c 00000242
ff20: 000001b6 c79fd280 c7a9ff64 c7a9ff38 c785e380 c7473b00 00000000 00000241
ff40: 000001b6 ffffff9c 00000003 c7bb0000 c7a9e000 00000000 c7a9ff94 c7a9ff68
ff60: c009c128 c00aa380 4d18b5f0 08000000 00000000 00071214 0007128c 00071214
ff80: 00000005 c0027ee4 c7a9ffa4 c7a9ff98 c009c274 c009c0d8 00000000 c7a9ffa8
ffa0: c0027d40 c009c25c 00071214 0007128c 0007128c 00000241 000001b6 00000000
ffc0: 00071214 0007128c 00071214 00000005 00073580 00000003 000713e0 400010d0
ffe0: 00000001 bef0c7b8 000269cc 4d214fec 60000010 0007128c 00000000 00000000
Backtrace:
[<c0098c58>] (__kmalloc+0x0/0xdc) from [<bf06d03c>] (gs_alloc_req+0x40/0x70 [g_serial]) r8:c71598c0 r7:00000020 r6:c7a6a1d0 r5:00000040 r4:c7ba8fc0
[<bf06cffc>] (gs_alloc_req+0x0/0x70 [g_serial]) from [<bf06d098>] (gs_alloc_requests+0x2c/0x78 [g_serial]) r7:bf06c83c r6:c7a6a1d0 r5:00000003 r4:c71598c0
[<bf06d06c>] (gs_alloc_requests+0x0/0x78 [g_serial]) from [<bf06d130>] (gs_start_io+0x4c/0xac [g_serial]) r7:c7159898 r6:c7a6a2d8 r5:00000000 r4:c7159880
[<bf06d0e4>] (gs_start_io+0x0/0xac [g_serial]) from [<bf06d4b4>] (gs_open+0x1c0/0x224 [g_serial]) r9:c7a9e000 r8:c7afbc00 r7:00000000 r6:00000000 r5:c7159880 r4:c79ca000
[<bf06d2f4>] (gs_open+0x0/0x224 [g_serial]) from [<c01c37b4>] (tty_open+0x1dc/0x314)
[<c01c35d8>] (tty_open+0x0/0x314) from [<c00a184c>] (chrdev_open+0x20c/0x22c)
[<c00a1640>] (chrdev_open+0x0/0x22c) from [<c009c438>] (__dentry_open+0x1a0/0x2b8) r8:c785e380 r7:c00a1640 r6:00000000 r5:c7463c78 r4:c79fd280
[<c009c298>] (__dentry_open+0x0/0x2b8) from [<c009c614>] (nameidata_to_filp+0x38/0x50)
[<c009c5dc>] (nameidata_to_filp+0x0/0x50) from [<c00aa6bc>] (do_filp_open+0x348/0x6f4) r4:00000000
[<c00aa374>] (do_filp_open+0x0/0x6f4) from [<c009c128>] (do_sys_open+0x5c/0x170)
[<c009c0cc>] (do_sys_open+0x0/0x170) from [<c009c274>] (sys_open+0x24/0x28) r8:c0027ee4 r7:00000005 r6:00071214 r5:0007128c r4:00071214
[<c009c250>] (sys_open+0x0/0x28) from [<c0027d40>] (ret_fast_syscall+0x0/0x2c)
Code: e59c4080 e59c8090 e3540000 159c308c (17943103)
---[ end trace be196e7cee3cb1c9 ]---
note: sh[1088] exited with preempt_count 2
process '-/bin/sh' (pid 1088) exited. Scheduling for restart.

Welcome to Wind River Linux

3 个答案:

答案 0 :(得分:7)

由于错误发生在kmalloc(回溯中的最顶行),这可能意味着内存分配器失败。常见原因包括双重释放内存或以其他方式破坏内存分配器结构。

由于回溯中的前一帧都属于模块g_serial,我可能会怀疑这是罪魁祸首。

我建议获取内核和模块的调试符号,以便能够在回溯中显示源代码文件和行。这将有很大帮助。

答案 1 :(得分:3)

内存分配器错误经常出现在随机位置,因为插入(释放)分配系统中的内存块的代码是错误的。接下来试图分配该块的任何人都会触发错误,尽管它本身并没有错。

他们刚刚在LWN http://lwn.net/Articles/260068/谈论kmemcheck ......这可能会有所帮助。否则,检查自己的代码,仔细检查每个内存分配和释放,仔细研究gcc -Wall的输出并寻找未定义的指针。

答案 2 :(得分:1)

我认为你在寻找printk。您可能想看看kernel debugging techniques的这篇介绍。