How do I make sense of this dmesg error message?

Date: 2016-03-22 11:49:52

Tags: c linux module linux-kernel

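I have written this simple module to get hold of a device and call some of its power management methods, e.g. .suspend and .resume. On initialization, the module simply looks up a specific device and tries to call its methods.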

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/pci.h>

static int __init mfps_driver_init(void){

struct pci_dev    *dev      = NULL;
struct pci_driver *driver   = NULL;
struct device     *device   = NULL;

dev = pci_get_device(0x8086, 0x15a2, NULL);

if((dev == NULL) || (dev == 0)){

    printk(KERN_INFO "LEONZO: NOTHING FOUND SIZE %ld\n", sizeof(dev));

} else {

    driver = dev->driver;

    printk(KERN_INFO "LEONZO: I FOUND THE DEVICE OF THE SIZE %ld\n", sizeof(dev));
    printk(KERN_INFO "LEONZO: HERE IS ITS DRIVER NAME %s\n", driver->name);
    printk(KERN_INFO "LEONZO: CALLING IT SUSPEND METHOD\n");

    *device = dev->dev;

    device_lock(device);

    device_unlock(device);
}

return 0;

}

static void __exit mfps_driver_exit(void){

}


module_init(mfps_driver_init);
module_exit(mfps_driver_exit);

The code compiles successfully. But when I load the module, I get a kernel error:

sudo insmod MyFirstPowerState.ko

dmesg shows the following output:

[   59.545180] MyFirstPowerState: module license 'unspecified' taints   kernel. 
[   59.545183] Disabling lock debugging due to kernel taint
[   59.546010] LEONZO: I FOUND THE DEVICE OF THE SIZE 8
[   59.546012] LEONZO: HERE IS ITS DRIVER NAME e1000e
[   59.546013] LEONZO: CALLING IT SUSPEND METHOD
[   59.546021] BUG: unable to handle kernel NULL pointer dereference         at           (null)
[   59.546051] IP: [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000         [MyFirstPowerState]
[   59.546077] PGD 0 
[   59.546085] Oops: 0002 [#1] SMP 
[   59.546097] Modules linked in: MyFirstPowerState(POE+) xt_CHECKSUM arc4 iwlmvm mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek iwlwifi snd_hda_codec_generic rtsx_pci_ms memstick cfg80211 nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_tcpudp ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables x_tables dm_crypt hp_wmi sparse_keymap intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul dm_multipath crc32_pclmul scsi_dh aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd joydev serio_raw lpc_ich uvcvideo snd_seq_midi snd_seq_midi_event snd_rawmidi snd_hda_intel snd_hda_controller snd_hda_codec videobuf2_vmalloc snd_hwdep shpchp snd_pcm videobuf2_memops videobuf2_core v4l2_common snd_seq e1000e(OE) i915_bpo ptp mei_me pps_core mei videodev media snd_seq_device intel_ips snd_timer drm_kms_helper drm btusb snd i2c_algo_bit soundcore 8250_fintek hp_accel lis3lv02d input_polldev tpm_infineon hp_wireless mac_hid parport_pc ppdev lp parport rfcomm bnep bluetooth binfmt_misc btrfs xor raid6_pq dm_mirror dm_region_hash dm_log uas usb_storage hid_generic usbhid hid rtsx_pci_sdmmc ahci psmouse libahci rtsx_pci wmi video
[   59.546577] CPU: 1 PID: 4180 Comm: insmod Tainted: P           OE   3.19.0-51-generic #58~14.04.1-Ubuntu
[   59.546613] Hardware name: Hewlett-Packard HP EliteBook 840 G2/2216, BIOS M71 Ver. 01.05 03/26/2015
[   59.546648] task: ffff880241a7b110 ti: ffff880242f68000 task.ti: ffff880242f68000
[   59.546678] RIP: 0010:[<ffffffffc011907e>]  [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
[   59.546720] RSP: 0018:ffff880242f6bd18  EFLAGS: 00010246
[   59.546741] RAX: 0000000000000000 RBX: ffff880245b4d000 RCX: 00000000000000ae
[   59.546772] RDX: 0000000000000000 RSI: ffff880245b4d098 RDI: 0000000000000000
[   59.546807] RBP: ffff880242f6bd28 R08: 000000000000000a R09: 0000000000000000
[   59.546839] R10: 0000000000000d53 R11: ffff880242f6b9de R12: ffffffffc06a8000
[   59.546868] R13: 0000000000000000 R14: ffffffffc0119000 R15: ffff880242f6bef8
[   59.546900] FS:  00007f8787aa6740(0000) GS:ffff88024f440000(0000) knlGS:0000000000000000
[   59.546921] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.546936] CR2: 0000000000000000 CR3: 0000000244393000 CR4: 00000000003407e0
[   59.546955] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.546978] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.547006] Stack:
[   59.547014]  ffffffff81c1d060 ffff880204cd3280 ffff880242f6bda8 ffffffff81002144
[   59.547046]  0000000000000001 0000000000000002 ffff8801f8ddc4c0 0000000000000001
[   59.547079]  ffff880242f6bd88 ffffffff811cef19 ffffffff810f7aac 0000000000000018
[   59.547114] Call Trace:
[   59.547131]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
[   59.547162]  [<ffffffff811cef19>] ? kmem_cache_alloc_trace+0x199/0x220
[   59.547194]  [<ffffffff810f7aac>] ? load_module+0x164c/0x1cc0
[   59.547222]  [<ffffffff810f7ae5>] load_module+0x1685/0x1cc0
[   59.547247]  [<ffffffff810f3380>] ? store_uevent+0x40/0x40
[   59.547274]  [<ffffffff810f8296>] SyS_finit_module+0x86/0xb0
[   59.547298]  [<ffffffff817b788d>] system_call_fastpath+0x16/0x1b
[   59.547314] Code: c7 80 c0 4b c0 31 c0 e8 19 14 69 c1 48 c7 c7 a8 c0  4b c0 31 c0 e8 0b 14 69 c1 31 c0 48 8d b3 98 00 00 00 b9 ae 00 00 00 48 89 c7 <f3> a5 bf 60 00 00 00 e8 26 c7 69 c1 bf 60 00 00 00 e8 ac c5 69 
[   59.547393] RIP  [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
[   59.547416]  RSP <ffff880242f6bd18>
[   59.547425] CR2: 0000000000000000
[   59.554577] ---[ end trace 42e3b1c73677cdfa ]---

I have also noticed that, as a result, the module cannot be removed:

sudo rmmod MyFirstPowerState.ko 
rmmod: ERROR: Module MyFirstPowerState is in use

Any idea what this output means and how to correct the error?

1 answer:

Answer 0: (score: 4)

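I will try to explain the large wall of text below. As a note, the values in brackets on the left are timestamps (seconds since boot); they are not important for you here.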

  

> [   59.545180] MyFirstPowerState: module license 'unspecified' taints kernel.
> [   59.545183] Disabling lock debugging due to kernel taint

This is because you did not declare a module license. Typically you will see people put something like this in the same place as module_init:

MODULE_LICENSE("GPL");
  

> [   59.546010] LEONZO: I FOUND THE DEVICE OF THE SIZE 8
> [   59.546012] LEONZO: HERE IS ITS DRIVER NAME e1000e
> [   59.546013] LEONZO: CALLING IT SUSPEND METHOD

These are your printk messages; nothing special here.

  

> [   59.546021] BUG: unable to handle kernel NULL pointer dereference at (null)

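This is the actual cause of the crash. The kernel tried to dereference a NULL pointer, which is the kernel-space equivalent of a segmentation fault. As Ian pointed out in the comments, the crash comes from writing *device = dev->dev instead of device = &dev->dev. You meant to make device point at dev->dev, but *device = dev->dev copies the whole struct device to the address stored in device. Since device is still NULL at that point, the copy writes to address 0 and the kernel faults.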

  

> [   59.546051] IP: [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
> ...
> [   59.546648] task: ffff880241a7b110 ti: ffff880242f68000 task.ti: ffff880242f68000

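Most of the block above is of little value to you; it is more useful to someone who has deployed something and has a specific user running into a problem. It lists things such as the hardware, the module that caused the crash, and all the modules that were loaded at the time. The IP line is the exception: it names the function and offset where the fault occurred, here mfps_driver_init+0x7e.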

  

> [   59.546720] RIP: 0010:[<ffffffffc011907e>]  [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
> ...
> [   59.547079]  ffff880242f6bd88 ffffffff811cef19 ffffffff810f7aac 0000000000000018

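Everything in this section is assembly-level information. If you have no experience with assembly it will mean little to you, although I suggest learning the basics, since it really does help in situations like this. The top half is the registers and their current values; the bottom half is the raw contents of the current stack.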

> [   59.547114] Call Trace:
> [   59.547131]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
> [   59.547162]  [<ffffffff811cef19>] ? kmem_cache_alloc_trace+0x199/0x220
> [   59.547194]  [<ffffffff810f7aac>] ? load_module+0x164c/0x1cc0

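Everything in the call trace is very useful, especially as modules grow longer and become hard to debug with breakpoints and the like. Basically it lists every function call that led the system to this crash. In your case, because you went straight from loading the module to the crash, the trace really only contains load_module plus a few wrappers and low-level system call entry points. But if, say, your init function called another function that then crashed, you would see that call path here.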

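The last part is the machine code around the faulting instruction (the Code: line), followed by a few registers repeated (RIP, RSP and CR2).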

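Hopefully this explains the wall of text you get from dmesg when you cause a kernel problem (I am not sure whether this counts as a panic; someone please correct me). If anything is still unclear I will try to explain, although I am by no means an expert in this area.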