What could cause the following message?
BUG: spinlock lockup suspected on CPU#0, sh/11786
lock: kmap_lock+0x0/0x40, .magic: dead4ead, .owner: sh/11787, .owner_cpu: 1
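Answer 0 (score: 3)
BUG: spinlock lockup suspected on CPU#0, sh/11786
This means that CPU0 is stuck spinning on a lock, and the thread/process involved is sh (or something started by sh, I'm not sure). You should look at the stack trace that the kernel dumps along with the message. For example: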
127|uid=0 gid=1007@nutshell:/var # [ 172.285647] BUG: spinlock lockup on CPU#0, swapper/0, 983482f0
[ 172.291523] [<8003cb44>] (unwind_backtrace+0x0/0xf8) from [<801853e4>] (do_raw_spin_lock+0x100/0x164)
[ 172.300768] [<801853e4>] (do_raw_spin_lock+0x100/0x164) from [<80350508>] (_raw_spin_lock_irqsave+0x54/0x60)
[ 172.310618] [<80350508>] (_raw_spin_lock_irqsave+0x54/0x60) from [<7f3cf4a0>] (mlb_os81092_interrupt+0x18/0x68 [os81092])
[ 172.321636] [<7f3cf4a0>] (mlb_os81092_interrupt+0x18/0x68 [os81092]) from [<800abee0>] (handle_irq_event_percpu+0x50/0x184)
[ 172.332781] [<800abee0>] (handle_irq_event_percpu+0x50/0x184) from [<800ac050>] (handle_irq_event+0x3c/0x5c)
[ 172.342622] [<800ac050>] (handle_irq_event+0x3c/0x5c) from [<800ae00c>] (handle_level_irq+0xac/0xfc)
[ 172.351767] [<800ae00c>] (handle_level_irq+0xac/0xfc) from [<800ab82c>] (generic_handle_irq+0x2c/0x40)
[ 172.361090] [<800ab82c>] (generic_handle_irq+0x2c/0x40) from [<800552e8>] (mx3_gpio_irq_handler+0x78/0x140)
[ 172.370843] [<800552e8>] (mx3_gpio_irq_handler+0x78/0x140) from [<800ab82c>] (generic_handle_irq+0x2c/0x40)
[ 172.380595] [<800ab82c>] (generic_handle_irq+0x2c/0x40) from [<80036904>] (handle_IRQ+0x4c/0xac)
[ 172.389402] [<80036904>] (handle_IRQ+0x4c/0xac) from [<80035ad0>] (__irq_svc+0x50/0xd0)
[ 172.397416] [<80035ad0>] (__irq_svc+0x50/0xd0) from [<80036bb4>] (default_idle+0x28/0x2c)
[ 172.405603] [<80036bb4>] (default_idle+0x28/0x2c) from [<80036e9c>] (cpu_idle+0x9c/0x108)
[ 172.413793] [<80036e9c>] (cpu_idle+0x9c/0x108) from [<800088b4>] (start_kernel+0x294/0x2e4)
[ 172.422181] [<800088b4>] (start_kernel+0x294/0x2e4) from [<10008040>] (0x10008040)
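[1] The trace shows you the function call chain. Note this line:
[ 172.310618] [<80350508>] (_raw_spin_lock_irqsave+0x54/0x60) from [<7f3cf4a0>] (mlb_os81092_interrupt+0x18/0x68 [os81092])
It tells you that the function mlb_os81092_interrupt was trying to take a lock with spin_lock_irqsave. So you can find out what that spinlock protects, then analyze the code or add logging to detect who is holding the lock, and from there work out how to avoid the lockup.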
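As a starting point for finding the holder, the report quoted in the question already carries a "lock:" line whose .owner and .owner_cpu fields name the task and CPU that currently hold the lock. The following is a minimal userspace sketch that pulls those fields out of such a line. The struct lock_report type and the parse_lock_line helper are hypothetical names made up for this illustration, and the sketch assumes the line has exactly the layout shown in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Fields taken from the "lock:" line of a spinlock lockup report. */
struct lock_report {
    char symbol[64];      /* e.g. "kmap_lock+0x0/0x40" */
    unsigned int magic;   /* e.g. 0xdead4ead */
    char owner_comm[48];  /* task name of the holder, e.g. "sh" */
    int owner_pid;        /* pid of the holder */
    int owner_cpu;        /* CPU on which the holder took the lock */
};

/* Hypothetical helper: returns 0 on success, -1 if the line does not match. */
static int parse_lock_line(const char *line, struct lock_report *out)
{
    char owner[48];
    char *slash;

    assert(line != NULL && out != NULL);

    if (sscanf(line,
               " lock: %63[^,], .magic: %x, .owner: %47[^,], .owner_cpu: %d",
               out->symbol, &out->magic, owner, &out->owner_cpu) != 4)
        return -1;

    /* Task names may contain '/' (e.g. "swapper/0"), so split at the last one. */
    slash = strrchr(owner, '/');
    if (slash == NULL)
        return -1;
    *slash = '\0';
    out->owner_pid = atoi(slash + 1);
    snprintf(out->owner_comm, sizeof(out->owner_comm), "%s", owner);
    return 0;
}
```

Applied to the line from the question, this gives owner sh, pid 11787, CPU 1, while the task that is spinning is sh/11786 on CPU#0. So in that report the lock is held by a different task on a different CPU that is not releasing it, and that task's code path is probably the one to inspect first.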
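[2] Also, since it is CPU0 that is locked up and this may be a multiprocessor system, you should check whether there is an IRQ whose handler uses the same critical resource. If that handler runs on another CPU (say CPU1), it's fine: it just spins until the lock is released. But if CPU0 itself takes the interrupt while it holds the lock, and the lock was taken with spin_lock instead of spin_lock_irqsave, the handler spins on a lock that the interrupted code can never release, which is a deadlock. So check for that.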