unwind_frame causes a kernel paging fault

Time: 2013-08-28 05:14:34

Tags: c linux linux-kernel

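Background

I hit a strange kernel oops, searched Google extensively, and found nothing. The setup:

  • The kernel version is 3.0.8

  • There are two processes, call them p1 and p2

  • p2 has many threads (about 30)

  • p1 keeps calling system(pidof("name of p1"))

The kernel may oops after running for a few days. The root cause I found is that unwind_frame, called from get_wchan, gets a strange frame->fp (0xFFFFFFFF). When this line executes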

frame->fp = *(unsigned long *)(fp - 12);

the CPU tries to access 0xFFFFFFF3, which triggers the paging fault.
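
The faulting address follows directly from 32-bit pointer arithmetic. A minimal sketch, assuming unsigned long is 32 bits wide as on this ARMv7 kernel (the helper name saved_fp_slot is made up for illustration and is not a kernel function):

```c
#include <stdint.h>

/* Address that `*(unsigned long *)(fp - 12)` dereferences when
 * unsigned long is 32 bits wide; uint32_t models that width so the
 * result is the same on any host. */
static uint32_t saved_fp_slot(uint32_t fp)
{
    return fp - 12u;
}
```

saved_fp_slot(0xFFFFFFFFu) evaluates to 0xFFFFFFF3, the address reported in the Oops.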

My question is:

How can the fp register that was saved before the context switch end up as 0xFFFFFFFF?


Here is the CPU info:

# cat /proc/cpuinfo 
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 1849.75

processor       : 1
BogoMIPS        : 1856.30

Features        : swp half thumb fastmult vfp edsp vfpv3 vfpv3d16 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc09
CPU revision    : 0

Here are the Oops and the pt registers:

[734212.113136] Unable to handle kernel paging request at virtual address fffffff3
[734212.113154] pgd = 826f0000
[734212.113175] [fffffff3] *pgd=8cdfe821, *pte=00000000, *ppte=00000000
[734212.113199] Internal error: Oops: 17 [#1] SMP
--------------cut--------------    
[734212.113464] CPU: 1    Tainted: P             (3.0.8 #2)
[734212.113523] PC is at unwind_frame+0x48/0x68
[734212.113538] LR is at get_wchan+0x8c/0x298
[734212.113557] pc : [<8003d120>]    lr : [<8003a660>]    psr: a0000013
[734212.113561] sp : 845d1cc8  ip : 00000003  fp : 845d1cd4
[734212.113583] r10: 00000001  r9 : 00000000  r8 : 80493c34
[734212.113597] r7 : 00000000  r6 : 00000000  r5 : 83354960  r4 : 845d1cd8
[734212.113613] r3 : 845d1cd8  r2 : ffffffff  r1 : 80490000  r0 : 8049003f
[734212.113632] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[734212.113651] Control: 10c53c7d  Table: 826f004a  DAC: 00000015

Here is the call stack:

[734212.117027] Backtrace:
[734212.117052] [<8003d0d8>] (unwind_frame+0x0/0x68) from [<8003a660>] (get_wchan+0x8c/0x298)
[734212.117079] [<8003a5d4>] (get_wchan+0x0/0x298) from [<8011f700>] (do_task_stat+0x548/0x5ec)
[734212.117099]  r4:00000000
[734212.117118] [<8011f1b8>] (do_task_stat+0x0/0x5ec) from [<8011f7c0>] (proc_tgid_stat+0x1c/0x24)
[734212.117158] [<8011f7a4>] (proc_tgid_stat+0x0/0x24) from [<8011b7f0>] (proc_single_show+0x54/0x98)
[734212.117196] [<8011b79c>] (proc_single_show+0x0/0x98) from [<800e9024>] (seq_read+0x1b4/0x4e4)
[734212.117215]  r8:845d1f08 r7:845d1f70 r6:00000001 r5:8ca89d20 r4:866ea540
[734212.117237] r3:00000000
[734212.117264] [<800e8e70>] (seq_read+0x0/0x4e4) from [<800c8c54>] (vfs_read+0xb4/0x19c)
[734212.117289] [<800c8ba0>] (vfs_read+0x0/0x19c) from [<800c8e18>] (sys_read+0x44/0x74)
[734212.117307]  r8:00000000 r7:00000003 r6:000003ff r5:7ea00818 r4:8ca89d20
[734212.117340] [<800c8dd4>] (sys_read+0x0/0x74) from [<800393c0>] (ret_fast_syscall+0x0/0x30)
[734212.117358]  r9:845d0000 r8:80039568 r6:7ea00c90 r5:0000000e r4:7ea00818
[734212.117388] Code: e3c10d7f e3c0103f e151000c 9afffff6 (e512100c)
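
The parenthesized word on the Code: line is the faulting instruction, and decoding it ties the fault to a register in the dump. The sketch below is a small user-space decoder for ARM immediate-offset load/store encodings. The struct and function names are hypothetical, and the decoder ignores the condition field and the P/W indexing bits.

```c
#include <stdint.h>

/* Fields of an ARM (A32) single-register load/store with immediate offset. */
struct ldst_imm {
    int is_load;     /* L bit (20): 1 = LDR, 0 = STR */
    int is_byte;     /* B bit (22): 1 = byte access, 0 = word access */
    int rn;          /* base register, bits 19:16 */
    int rt;          /* transfer register, bits 15:12 */
    int32_t offset;  /* +imm12 if the U bit (23) is set, otherwise -imm12 */
};

/* Returns 1 and fills *out if insn is an immediate-offset load/store
 * (bits 27:25 == 0b010); returns 0 for any other encoding. */
static int decode_ldst_imm(uint32_t insn, struct ldst_imm *out)
{
    if (((insn >> 25) & 0x7u) != 0x2u)
        return 0;

    int32_t imm12 = (int32_t)(insn & 0xFFFu);

    out->is_load = (int)((insn >> 20) & 1u);
    out->is_byte = (int)((insn >> 22) & 1u);
    out->rn      = (int)((insn >> 16) & 0xFu);
    out->rt      = (int)((insn >> 12) & 0xFu);
    out->offset  = ((insn >> 23) & 1u) ? imm12 : -imm12;
    return 1;
}
```

For 0xe512100c this yields a word load into r1 from base r2 with offset -12, that is, ldr r1, [r2, #-12]. With r2 = ffffffff in the register dump, the access lands on fffffff3, which matches the reported fault address.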

1 answer:

Answer 0: (score: 1)

This bug was fixed by Konstantin Khlebnikov; the details are in the git commit log.
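
The question does not show the bounds check that precedes the faulting line, and the commit is not reproduced here, so treat the following as an illustration under assumptions rather than the actual patch. If such a check adds to fp in 32-bit arithmetic, an fp of 0xFFFFFFFF can wrap around and slip through; moving the arithmetic onto the bounds avoids that. Both function names and the stack bounds used in the note below are hypothetical.

```c
#include <stdint.h>

/* Assumed shape of a check that adds to fp: fp + 4 wraps modulo 2^32,
 * so a huge fp can look like a small one. Returns 1 if fp is accepted. */
static int fp_in_bounds_wrapping(uint32_t fp, uint32_t low, uint32_t high)
{
    return !(fp < low + 12u || fp + 4u >= high);
}

/* Same test when nothing wraps, but fp itself is never incremented.
 * Assumes high >= 4 and that low + 12 does not wrap, which holds for
 * ordinary kernel stack addresses. */
static int fp_in_bounds_safe(uint32_t fp, uint32_t low, uint32_t high)
{
    return !(fp < low + 12u || fp >= high - 4u);
}
```

With made-up bounds low = 0x845d1cc8 and high = 0x845d2000, the wrapping variant accepts fp = 0xFFFFFFFF because fp + 4 evaluates to 3, while the safe variant rejects it. Both accept an in-range value such as 0x845d1cd4.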