Troubleshooting an atomic_set operation that causes a kernel paging request error

Time: 2014-12-22 07:34:51

Tags: c linux linux-kernel

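I am trying to fix a bug that crashes a Linux kernel driver I am writing. The code works fine on one machine. I migrated the same code to another machine, and now it crashes. I am having a hard time pinning down what the problem actually is. I have narrowed it down to the following block of code.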

mutex_lock(&(ctl->mtx));
atomic_set(&ctl->app.enabled, 1); // this line crashes it
mutex_unlock(&(ctl->mtx));

I have also tried the following.

mutex_lock(&(ctl->mtx));
atomic_set(&ctl->app.enabled, 0x01);
mutex_unlock(&(ctl->mtx));

Here are the structures involved.

struct app_struct {
    atomic_t enabled;
};

struct ctl_struct {
    struct mutex mtx;
    struct app_struct app;
};

Both versions of the code produce the following:

BUG: unable to handle kernel paging request at 0000000000002c28
IP: [<ffffffffa02cd847>] mod_start_trace+0x157/0x1a0 [test_mod]
Dec 22 14:02:42 test_server kernel: [41114.399186] PGD 7ef858067 PUD 7f06d2067 PMD 0
Dec 22 14:02:42 test_server kernel: [41114.399830] Oops: 0002 [#1] SMP
Dec 22 14:02:42 test_server kernel: [41114.400006] Modules linked in: test_mod     ipt_MASQUERADE iptable_nat nf_nat_ipv4 xt_CHECKSUM iptable_mangle bridge stp llc ebtable_nat ebtables ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack xfs ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables libcrc32c gpio_ich coretemp ast joydev lpc_ich ttm drm_kms_helper drm i5000_edac serio_raw edac_core i2c_algo_bit syscopyarea i5k_amb sysfillrect sysimgblt shpchp lp parport tpm_infineon mac_hid ipmi_si raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 mptsas hid_generic mptscsih raid0 e1000e multipath ahci usbhid mptbase ptp usb_storage psmouse hid libahci scsi_transport_sas pps_core linear [last unloaded: test_mod]
Dec 22 14:02:42 test_server kernel: [41114.400006] CPU: 3 PID: 2820 Comm: ctl Tainted:     G           O 3.13.11.4 #1
Dec 22 14:02:42 test_server kernel: [41114.400006] Hardware name: Sun Microsystems SUN FIRE     X4150/SUN FIRE X4150, BIOS 1ADQW068 11/16/2010
Dec 22 14:02:42 test_server kernel: [41114.400006] task: ffff8807ed7c5fc0 ti: ffff8807ef498000 task.ti: ffff8807ef498000
Dec 22 14:02:42 test_server kernel: [41114.400006] RIP: 0010:[<ffffffffa02cd847>]  [<ffffffffa02cd847>] mod_start_trace+0x157/0x1a0 [test_mod]
Dec 22 14:02:42 test_server kernel: [41114.400006] RSP: 0018:ffff8807ef499e88  EFLAGS: 00010246
Dec 22 14:02:42 test_server kernel: [41114.400006] RAX: 0000000000000000 RBX: ffff88002bd3c000 RCX: 0000000000000006
Dec 22 14:02:42 test_server kernel: [41114.400006] RDX: 0000000000000007 RSI: 0000000000000046 RDI: ffffffffa02cfe71
Dec 22 14:02:42 test_server kernel: [41114.400006] RBP: ffff8807ef499ea8 R08: 0000000000000092 R09: 0000000000000afc
Dec 22 14:02:42 test_server kernel: [41114.400006] R10: 0000000000000000 R11: ffff8807ef499bb6 R12: ffffffffa02d1453
Dec 22 14:02:42 test_server kernel: [41114.400006] R13: ffff8807efe64160 R14: ffffffffa02d12a0 R15: 0000000000000000
Dec 22 14:02:42 test_server kernel: [41114.400006] FS:  00007fc797461880(0000) GS:ffff88081fcc0000(0000) knlGS:0000000000000000
Dec 22 14:02:42 test_server kernel: [41114.400006] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 22 14:02:42 test_server kernel: [41114.400006] CR2: 0000000000002c28 CR3: 00000007f1295000 CR4: 00000000000027e0
Dec 22 14:02:42 test_server kernel: [41114.400006] DR0: 00000000000000a0 DR1: 0000000000000000 DR2: 0000000000000003
Dec 22 14:02:42 test_server kernel: [41114.400006] DR3: 00000000000000b0 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 22 14:02:42 test_server kernel: [41114.400006] Stack:
Dec 22 14:02:42 test_server kernel: [41114.400006]  ffff8807f0bef800 00007fc79747f000 0000000000000014 ffff8807efe64160
Dec 22 14:02:42 test_server kernel: [41114.400006]  ffff8807ef499ed8 ffffffffa02ce368 ffff8807d1412480 00007fc79747f000
Dec 22 14:02:42 test_server kernel: [41114.400006]  ffff8807ef499f50 0000000000000014 ffff8807ef499ef8 ffffffff812243ad
Dec 22 14:02:42 test_server kernel: [41114.400006] Call Trace:
Dec 22 14:02:42 test_server kernel: [41114.400006]  [<ffffffffa02ce368>] mod_test_proc_control+0x548/0x5b0 [mod]
Dec 22 14:02:42 test_server kernel: [41114.400006]  [<ffffffff812243ad>] proc_reg_write+0x3d/0x80
Dec 22 14:02:42 test_server kernel: [41114.400006]  [<ffffffff811bcb54>] vfs_write+0xb4/0x1f0
Dec 22 14:02:42 test_server kernel: [41114.400006]  [<ffffffff811bd589>] SyS_write+0x49/0xa0
Dec 22 14:02:42 test_server kernel: [41114.400006]  [<ffffffff8172d42d>] system_call_fastpath+0x1a/0x1f
Dec 22 14:02:42 test_server kernel: [41114.400006] Code: 44 e1 31 c0 48 c7 c7 10 f9 2c a0 e8 54 ad 44 e1 48 85 db 74 98 4c 89 e7 e8 57 5a 45 e1 49 8b 86 a3 01 00 00 48 c7 c7 71 fe 2c a0 <c7> 80 28 2c 00 00 01 00 00 00 31 c0 e8 28 ad 44 e1 4c 89 e7 e8
Dec 22 14:02:42 gol3430-01 kernel: [41114.400006] RIP  [<ffffffffa02cd847>] mod_start_trace+0x157/0x1a0 [mod]
Dec 22 14:02:42 test_server kernel: [41114.400006]  RSP <ffff8807ef499e88>
Dec 22 14:02:42 test_server kernel: [41114.400006] CR2: 0000000000002c28
Dec 22 14:02:42 test_server kernel: [41114.440174] ---[ end trace 37ddb83f133ddac1 ]---
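
One way to read the oops above, offered as a sketch rather than a diagnosis: the bytes after the `<c7>` marker on the `Code:` line appear to encode a 32-bit store of the value 1 to `0x2c28(%rax)`, and the register dump shows RAX as 0, so the store would land on address 0x2c28, which is the value in CR2. That would be consistent with the base pointer being NULL on this machine when `atomic_set` runs. The user-space C below only models that arithmetic. `struct ctl_model`, `struct app_model`, `other_fields`, `le32` and `enabled_address` are hypothetical names, and the padding size is an assumption chosen so the member offset matches the displacement; the real structure layout is not shown in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes at the faulting RIP, copied from the "Code:" line starting at <c7>. */
static const unsigned char fault_insn[] = {
    0xc7, 0x80,              /* opcode C7 /0, ModRM 0x80: store imm32 to [rax + disp32] */
    0x28, 0x2c, 0x00, 0x00,  /* disp32, little-endian */
    0x01, 0x00, 0x00, 0x00   /* imm32, little-endian */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical stand-in for the real structures. The padding is an
 * assumption: it is sized only so that `enabled` sits at offset 0x2c28. */
struct app_model {
    int32_t enabled;                     /* atomic_t wraps an int counter */
};

struct ctl_model {
    unsigned char other_fields[0x2c28];  /* placeholder for unknown members */
    struct app_model app;
};

/* Address a store to base->app.enabled would touch for a given base. */
static uintptr_t enabled_address(uintptr_t base)
{
    return base + offsetof(struct ctl_model, app)
                + offsetof(struct app_model, enabled);
}
```

With these definitions, `le32(fault_insn + 2)` gives the displacement and `le32(fault_insn + 6)` gives the stored value, while `enabled_address(0)` gives the address a NULL base would produce. If that reading holds, checking where `ctl` is assigned on the failing machine, and whether that assignment can fail or be skipped, may be a reasonable place to start.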

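I have run into this error before; that time there was a problem in my code and the BIOS was also out of date. I have been doing some research, and so far I have only found material about the BIOS. I have only just started programming with atomic operations, and my troubleshooting attempts have caused problems of their own. Any help, or a pointer on where to start troubleshooting, would be greatly appreciated. Again, this code works fine on another machine. Thanks!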

0 answers:

No answers