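I'd like to somehow "watch" a variable (or a memory address) in the Linux kernel (in a kernel module/driver, to be exact) and find out what changes it: basically, print a stack trace whenever the variable changes.
For example, in the kernel module testjiffy-hr.c listed at the end of this answer, I'd like to print a stack trace each time the runcount variable changes; hopefully the stack trace would then mention testjiffy_timer_function, which is indeed the function that changes that variable.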
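Now, I know I can use kgdb to connect to a debug Linux kernel running in a virtual machine, and even set breakpoints (so hopefully watchpoints too). The problem is that I actually want to debug an ALSA driver, in particular the playback dma_area buffer (where I get some unexpected data), and that is highly timing-sensitive; running a debug kernel would by itself disturb the timing (let alone running it in a virtual machine).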
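The bigger problem is that the playback dma_area pointer exists only during a playback operation (in other words, between the _start and _stop handlers), so I would have to record the dma_area address in each _start callback, and then somehow "schedule" it for "watching" during the playback operation.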
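So I was hoping there is a way to do something like this directly in the driver code: for instance, add some code to the _start callback that records the dma_area pointer, and pass it as an argument to a command that starts "watching" for changes, with the stack trace printed from a corresponding callback function. (I know this will also affect timing, but I hope it would be "light" enough not to influence the "live" driver operation too much.)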
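So my question is: does such a debugging technique exist in the Linux kernel?
If not: is it possible to set up a hardware (or software) interrupt that reacts to a change at a specific memory address? Could I then install a handler for that interrupt which prints a stack trace? (Although I think the whole context changes when an IRQ handler runs, so the stack trace obtained there might be wrong.)
If not: are there any other techniques that would let me print the stack trace of the process that changes a value stored at a given memory location in the kernel (hopefully in a live, non-debug kernel)?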
Answer 0 (score: 12)
Many thanks to @CosminRatiu and Eugene for their replies; thanks to those, I found:
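...which I could use to develop the example posted here: the testhrarr.c kernel module/driver and its Makefile (below). It demonstrates that hardware watchpoint tracing can be achieved in two ways: either with the perf program, which can probe the driver unchanged; or by adding some hardware breakpoint code to the driver (in the example, enclosed by the HWDEBUG_STACK define).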
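In essence, debugging the contents of standard atomic variable types like int (such as the runcount variable) is straightforward, as long as they are defined as global variables in the kernel module, so that they end up showing as kernel symbols globally. That is why the code below adds testhrarr_ as a prefix to the variables (to avoid naming conflicts). Debugging the contents of an array, however, can be a bit trickier because of the need for dereferencing; that is what this post demonstrates, by watching the first byte of the testhrarr_arr array. It was done on: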
$ echo `cat /etc/lsb-release`
DISTRIB_ID=Ubuntu DISTRIB_RELEASE=11.04 DISTRIB_CODENAME=natty DISTRIB_DESCRIPTION="Ubuntu 11.04"
$ uname -a
Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux
$ cat /proc/cpuinfo | grep "model name"
model name : Intel(R) Atom(TM) CPU N450 @ 1.66GHz
model name : Intel(R) Atom(TM) CPU N450 @ 1.66GHz
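The testhrarr module basically allocates memory for a small array at module init, sets up a timer function, and exposes a /proc/testhrarr_proc file (using the newer proc_create interface). Reading from the /proc/testhrarr_proc file (say, using cat) then triggers the timer function, which modifies the testhrarr_arr array values and dumps messages to /var/log/syslog. We expect testhrarr_arr[0] to change three times during the operation: once in testhrarr_startup, and twice in testhrarr_timer_function (due to wrapping).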
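To see where the "three times" comes from, the module's update pattern can be replayed in plain user-space C. This is only a sketch: count_first_element_writes is a hypothetical helper written for this illustration, not part of the module, and it mirrors the arithmetic in testhrarr_startup and testhrarr_timer_function without any timers or kernel APIs.

```c
#include <stddef.h>

#define ARRSIZE 5
#define MAXRUNS (2 * ARRSIZE)

/* Hypothetical helper (not in the module): replays the update pattern on
 * a plain array and returns how many times element 0 is written. */
int count_first_element_writes(int arr[ARRSIZE])
{
    int writes = 0;
    int runcount;
    size_t i;

    for (i = 0; i < ARRSIZE; i++)
        arr[i] = 0;               /* stands in for kcalloc's zero fill */

    arr[0] = 0;                   /* the reset done in testhrarr_startup */
    writes++;

    for (runcount = 0; runcount < MAXRUNS; runcount++) {
        int idx = runcount % ARRSIZE;

        arr[idx] += runcount;     /* the update in testhrarr_timer_function */
        if (idx == 0)
            writes++;             /* index wraps to 0 at runcount 0 and 5 */
    }
    return writes;
}
```

Called on a five-element array, this should report three writes to element 0 and leave the array as 5, 7, 9, 11, 13, which matches the final syslog line of the module.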
perf
After building the module with make, you can load it with:
sudo insmod ./testhrarr.ko
At this point, /var/log/syslog will contain:
kernel: [40277.199913] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
kernel: [40277.199930] Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xed182a80 (0xed182a80) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
kernel: [40277.220329] HW Breakpoint for testhrarr_arr write installed (0xf84be2a0)
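Note that passing just testhrarr_arr as the symbol for the hardware watchpoint watches the address of that pointer variable (0xf84be2a0), not the address of the first element of the array (0xed182a80)! Because of that, the hardware breakpoint will not trigger, and the module behaves as if the hardware breakpoint code were not present at all (which can also be achieved by undefining HWDEBUG_STACK)!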
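The difference between the address of a pointer variable and the address it points to can be shown in user space. The sketch below is an analogy only: resolve_watch_address is a hypothetical name, symbol_addr stands in for what a symbol lookup would return, and the code assumes (as on mainstream platforms) that all data pointers share one representation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: symbol_addr models the result of a symbol lookup,
 * i.e. the address of the variable itself. If that variable is a pointer,
 * the data to watch lives at the address stored inside it, so one extra
 * dereference is needed. */
uintptr_t resolve_watch_address(const void *symbol_addr, int symbol_is_pointer)
{
    if (symbol_is_pointer) {
        void *target;

        /* read the pointer value stored in the variable */
        memcpy(&target, symbol_addr, sizeof target);
        return (uintptr_t)target;
    }
    return (uintptr_t)symbol_addr;
}
```

With `int buf[5]; int *first = &buf[0];`, passing `&first` without the dereference yields the address of `first` itself (the 0xf84be2a0 case above), while passing it with the dereference yields the address of `buf[0]` (the 0xed182a80 case).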
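So, even without a hardware breakpoint set through the kernel module code, we can still use perf to observe changes at a memory address. With perf, we specify both the address we want to watch (here the address of the first element of testhrarr_arr, 0xed182a80) and the process that should be run: here we run bash, so that we can execute cat /proc/testhrarr_proc, which triggers the kernel module timer, followed by sleep 0.5, which gives the timer time to complete. The -a parameter is also needed, otherwise some events may be missed: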
$ sudo perf record -a --call-graph --event=mem:0xed182a80:w bash -c 'cat /proc/testhrarr_proc ; sleep 0.5'
testhrarr proc: startup
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.485 MB perf.data (~21172 samples) ]
At this point, /var/log/syslog will also contain something like:
[40822.114964] testhrarr_timer_function: testhrarr_runcount 0
[40822.114980] testhrarr jiffies 10130528 ; ret: 1 ; ktnsec: 40822114975062
[40822.118956] testhrarr_timer_function: testhrarr_runcount 1
[40822.118977] testhrarr jiffies 10130529 ; ret: 1 ; ktnsec: 40822118973195
[40822.122940] testhrarr_timer_function: testhrarr_runcount 2
[40822.122956] testhrarr jiffies 10130530 ; ret: 1 ; ktnsec: 40822122951143
[40822.126962] testhrarr_timer_function: testhrarr_runcount 3
[40822.126978] testhrarr jiffies 10130531 ; ret: 1 ; ktnsec: 40822126973583
[40822.130941] testhrarr_timer_function: testhrarr_runcount 4
[40822.130961] testhrarr jiffies 10130532 ; ret: 1 ; ktnsec: 40822130955167
[40822.134940] testhrarr_timer_function: testhrarr_runcount 5
[40822.134962] testhrarr jiffies 10130533 ; ret: 1 ; ktnsec: 40822134958888
[40822.138936] testhrarr_timer_function: testhrarr_runcount 6
[40822.138958] testhrarr jiffies 10130534 ; ret: 1 ; ktnsec: 40822138955693
[40822.142940] testhrarr_timer_function: testhrarr_runcount 7
[40822.142962] testhrarr jiffies 10130535 ; ret: 1 ; ktnsec: 40822142959345
[40822.146936] testhrarr_timer_function: testhrarr_runcount 8
[40822.146957] testhrarr jiffies 10130536 ; ret: 1 ; ktnsec: 40822146954479
[40822.150949] testhrarr_timer_function: testhrarr_runcount 9
[40822.150970] testhrarr jiffies 10130537 ; ret: 1 ; ktnsec: 40822150963438
[40822.154974] testhrarr_timer_function: testhrarr_runcount 10
[40822.154988] testhrarr [ 5, 7, 9, 11, 13, ]
To read the capture made by perf (a file called perf.data), we can use:
$ sudo perf report --call-graph flat --stdio
No kallsyms or vmlinux with build-id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 was found
[testhrarr] with build id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 not found, continuing without symbols
Failed to open /bin/cat, continuing without symbols
Failed to open /usr/lib/libpixman-1.so.0.20.2, continuing without symbols
Failed to open /usr/lib/xorg/modules/drivers/intel_drv.so, continuing without symbols
Failed to open /usr/bin/Xorg, continuing without symbols
# Events: 5 unknown
#
# Overhead  Command  Shared Object                                Symbol
# ........  .......  .............  ....................................
#
    87.50%     Xorg  [testhrarr]  [k] testhrarr_timer_function
            87.50%
                testhrarr_timer_function
                __run_hrtimer
                hrtimer_interrupt
                smp_apic_timer_interrupt
                apic_timer_interrupt
                0x30185d
                0x2ed701
                0x2ed8cc
                0x2edba0
                0x9d0386
                0x8126fc8
                0x81217a1
                0x811bdd3
                0x8070aa7
                0x806281c
                __libc_start_main
                0x8062411

     6.25%      cat  [testhrarr]  [k] testhrarr_timer_function
             6.25%
                testhrarr_timer_function
                testhrarr_proc_show
                seq_read
                proc_reg_read
                vfs_read
                sys_read
                syscall_call
                0xaa2416
                0x8049f4d
                __libc_start_main
                0x8049081

     3.12%  swapper  [testhrarr]  [k] testhrarr_timer_function
             3.12%
                testhrarr_timer_function
                __run_hrtimer
                hrtimer_interrupt
                smp_apic_timer_interrupt
                apic_timer_interrupt
                cpuidle_idle_call
                cpu_idle
                start_secondary

     3.12%      cat  [testhrarr]  [k] 0x356
             3.12%
                0xf84bc356
                0xf84bc3a7
                seq_read
                proc_reg_read
                vfs_read
                sys_read
                syscall_call
                0xaa2416
                0x8049f4d
                __libc_start_main
                0x8049081

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
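So, since we are building the kernel module with debugging on (-g in the Makefile), perf has no problem finding this module's symbols, even though the live kernel is not a debug kernel. It therefore correctly reports testhrarr_timer_function as the setter most of the time, although it doesn't report testhrarr_startup (it does report testhrarr_proc_show, which calls it). There are also references to 0xf84bc3a7 and 0xf84bc356 that it could not resolve; note, however, that the module is loaded at 0xf84bc000: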
$ sudo cat /proc/modules | grep testhr
testhrarr 13433 0 - Live 0xf84bc000
...and that entry also starts with ...[k] 0x356; if we now look at the objdump of the kernel module:
$ objdump -S testhrarr.ko | less
...
00000323 <testhrarr_startup>:
static void testhrarr_startup(void)
{
...
    testhrarr_arr[0] = 0; //just the first element
 34b:   a1 80 00 00 00          mov    0x80,%eax
 350:   c7 00 00 00 00 00       movl   $0x0,(%eax)
    hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
 356:   c7 04 24 01 00 00 00    movl   $0x1,(%esp) **********
 35d:   8b 15 1c 00 00 00       mov    0x1c,%edx
...
00000375 <testhrarr_proc_show>:
static int testhrarr_proc_show(struct seq_file *m, void *v) {
...
    seq_printf(m, "testhrarr proc: startup\n");
 38f:   c7 44 24 04 79 00 00    movl   $0x79,0x4(%esp)
 396:   00
 397:   8b 45 fc                mov    -0x4(%ebp),%eax
 39a:   89 04 24                mov    %eax,(%esp)
 39d:   e8 fc ff ff ff          call   39e
    testhrarr_startup();
 3a2:   e8 7c ff ff ff          call   323
 3a7:   eb 1c                   jmp    3c5 **********
  } else {
    seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
 3a9:   a1 0c 00 00 00          mov    0xc,%eax
...
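...so 0xf84bc356 is offset 0x356 in the module, the first instruction of the hrtimer_start call in testhrarr_startup, i.e. the instruction right after the write to testhrarr_arr[0] (an x86 data watchpoint fires after the write has completed); and 0xf84bc3a7 -> 3a7 is the return address in the calling testhrarr_proc_show function; which thankfully makes sense. (Note that with other versions of the driver I have seen _start resolved by name while timer_function showed up as bare addresses; I'm not sure what causes this.)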
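The address arithmetic used above (absolute address minus module load address gives the offset to look up in objdump) can be written down as a small user-space helper. This is a sketch with a hypothetical function name; the base 0xf84bc000 and size 13433 are the values from /proc/modules above, and using that size as the upper bound is an assumption, since it covers the module's whole memory footprint rather than only its code.

```c
#include <stdint.h>

/* Hypothetical helper: returns the offset of addr inside a module loaded
 * at module_base, or UINT64_MAX if addr lies outside
 * [module_base, module_base + module_size). */
uint64_t module_offset(uint64_t addr, uint64_t module_base, uint64_t module_size)
{
    if (addr < module_base || addr - module_base >= module_size)
        return UINT64_MAX;
    return addr - module_base;
}
```

For the two unresolved entries, `module_offset(0xf84bc356, 0xf84bc000, 13433)` gives 0x356 and `module_offset(0xf84bc3a7, 0xf84bc000, 13433)` gives 0x3a7, the offsets marked with asterisks in the objdump listing.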
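One problem with perf is that it gives me a statistical "Overhead" figure per function (I'm not sure what it refers to; possibly the time spent between entry to and exit from the function?), whereas what I really want is a sequential log of stack traces. I'm not sure whether perf can be set up for that, but it can definitely be done with hardware breakpoint code in the kernel module.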
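The code enclosed in HWDEBUG_STACK implements the setup and handling of the hardware breakpoint. As noted, the default value of the symbol ksym_name (if unspecified) is testhrarr_arr, which doesn't trigger the hardware breakpoint at all. The ksym parameter can be specified on the command line during insmod; here we can note that: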
$ sudo rmmod testhrarr # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr[0]
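...results in the following line in /var/log/syslog:
... HW Breakpoint for testhrarr_arr[0] write installed (0x (null))
This means that we cannot use symbols with bracket notation for array access; thankfully, the null pointer here just means that the hardware breakpoint again will not fire; it doesn't crash the OS :)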
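However, there is a global variable that refers to the first element of the testhrarr_arr array, called testhrarr_arr_first; note how this global variable is handled specially in the code, and how it needs to be dereferenced in order to obtain the correct address. So we do: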
$ sudo rmmod testhrarr # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr_first
...and the syslog reports:
kernel: [43910.509726] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
kernel: [43910.509765] Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xedf6c5c0 (0xedf6c5c0) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
kernel: [43910.538535] HW Breakpoint for testhrarr_arr_first write installed (0xedf6c5c0)
...and we can see that the hardware breakpoint is set at 0xedf6c5c0, which is the address of testhrarr_arr[0]. Now, if we trigger the driver through the /proc file:
$ cat /proc/testhrarr_proc
testhrarr proc: startup
...we get the following in syslog:
kernel: [44069.735695] testhrarr_arr_first value is changed
[44069.735711] Pid: 29320, comm: cat Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.735719] Call Trace:
[44069.735737] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.735755] [] ? __perf_event_overflow+0x90/0x240
[44069.735768] [] ? proc_alloc_inode+0x23/0x90
[44069.735778] [] ? proc_alloc_inode+0x23/0x90
[44069.735790] [] ? perf_swevent_event+0x136/0x140
[44069.735801] [] ? perf_bp_event+0x70/0x80
[44069.735812] [] ? prep_new_page+0x110/0x1a0
[44069.735824] [] ? get_page_from_freelist+0x12e/0x320
[44069.735836] [] ? seq_open+0x3d/0xa0
[44069.735848] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.735861] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.735872] [] ? notifier_call_chain+0x45/0x60
[44069.735883] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.735894] [] ? notify_die+0x2d/0x30
[44069.735904] [] ? do_debug+0x88/0x180
[44069.735915] [] ? debug_stack_correct+0x30/0x38
[44069.735928] [] ? testhrarr_startup+0x33/0x52 [testhrarr]
[44069.735940] [] ? testhrarr_proc_show+0x32/0x57 [testhrarr]
[44069.735952] [] ? seq_read+0x145/0x390
[44069.735963] [] ? seq_read+0x0/0x390
[44069.735973] [] ? proc_reg_read+0x64/0xa0
[44069.735985] [] ? vfs_read+0x9f/0x160
[44069.735995] [] ? proc_reg_read+0x0/0xa0
[44069.736003] [] ? sys_read+0x42/0x70
[44069.736013] [] ? syscall_call+0x7/0xb
[44069.736019] Dump stack from sample_hbp_handler
[44069.740132] testhrarr_timer_function: testhrarr_runcount 0
[44069.740146] testhrarr jiffies 10942435 ; ret: 1 ; ktnsec: 44069740142485
[44069.740159] testhrarr_arr_first value is changed
[44069.740169] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.740176] Call Trace:
[44069.740195] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.740213] [] ? __perf_event_overflow+0x90/0x240
[44069.740227] [] ? perf_swevent_event+0x136/0x140
[44069.740239] [] ? perf_bp_event+0x70/0x80
[44069.740253] [] ? sched_clock_local+0xd3/0x1c0
[44069.740267] [] ? format_decode+0x323/0x380
[44069.740280] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.740292] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.740302] [] ? notifier_call_chain+0x45/0x60
[44069.740313] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.740324] [] ? notify_die+0x2d/0x30
[44069.740335] [] ? do_debug+0x88/0x180
[44069.740345] [] ? debug_stack_correct+0x30/0x38
[44069.740364] [] ? init_intel_cacheinfo+0x103/0x394
[44069.740379] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.740391] [] ? __run_hrtimer+0x6f/0x190
[44069.740404] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.740416] [] ? hrtimer_interrupt+0x108/0x240
[44069.740430] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.740441] [] ? apic_timer_interrupt+0x31/0x38
[44069.740453] [] ? _raw_spin_unlock_irqrestore+0x15/0x20
[44069.740465] [] ? try_to_del_timer_sync+0x67/0xb0
[44069.740476] [] ? del_timer_sync+0x29/0x50
[44069.740486] [] ? flush_delayed_work+0x13/0x40
[44069.740500] [] ? tty_flush_to_ldisc+0x12/0x20
[44069.740510] [] ? n_tty_poll+0x4f/0x190
[44069.740523] [] ? tty_poll+0x6d/0x90
[44069.740531] [] ? n_tty_poll+0x0/0x190
[44069.740542] [] ? do_poll.clone.3+0xd0/0x210
[44069.740553] [] ? do_sys_poll+0x134/0x1e0
[44069.740563] [] ? __pollwait+0x0/0xd0
[44069.740572] [] ? pollwake+0x0/0x60
...
[44069.740742] [] ? pollwake+0x0/0x60
[44069.740757] [] ? rw_verify_area+0x6c/0x130
[44069.740770] [] ? ktime_get_ts+0xf8/0x120
[44069.740781] [] ? poll_select_set_timeout+0x64/0x70
[44069.740793] [] ? sys_poll+0x5a/0xd0
[44069.740804] [] ? syscall_call+0x7/0xb
[44069.740815] [] ? init_intel_cacheinfo+0x23/0x394
[44069.740822] Dump stack from sample_hbp_handler
[44069.744130] testhrarr_timer_function: testhrarr_runcount 1
[44069.744143] testhrarr jiffies 10942436 ; ret: 1 ; ktnsec: 44069744140055
[44069.748132] testhrarr_timer_function: testhrarr_runcount 2
[44069.748145] testhrarr jiffies 10942437 ; ret: 1 ; ktnsec: 44069748141271
[44069.752131] testhrarr_timer_function: testhrarr_runcount 3
[44069.752145] testhrarr jiffies 10942438 ; ret: 1 ; ktnsec: 44069752141164
[44069.756131] testhrarr_timer_function: testhrarr_runcount 4
[44069.756141] testhrarr jiffies 10942439 ; ret: 1 ; ktnsec: 44069756138318
[44069.760130] testhrarr_timer_function: testhrarr_runcount 5
[44069.760141] testhrarr jiffies 10942440 ; ret: 1 ; ktnsec: 44069760138469
[44069.760154] testhrarr_arr_first value is changed
[44069.760164] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.760170] Call Trace:
[44069.760187] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.760202] [] ? __perf_event_overflow+0x90/0x240
[44069.760213] [] ? perf_swevent_event+0x136/0x140
[44069.760224] [] ? perf_bp_event+0x70/0x80
[44069.760235] [] ? sched_clock_local+0xd3/0x1c0
[44069.760247] [] ? format_decode+0x323/0x380
[44069.760258] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.760269] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.760279] [] ? notifier_call_chain+0x45/0x60
[44069.760289] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.760299] [] ? notify_die+0x2d/0x30
[44069.760308] [] ? do_debug+0x88/0x180
[44069.760318] [] ? debug_stack_correct+0x30/0x38
[44069.760334] [] ? init_intel_cacheinfo+0x103/0x394
[44069.760345] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.760356] [] ? __run_hrtimer+0x6f/0x190
[44069.760366] [] ? send_to_group.clone.1+0xf8/0x150
[44069.760376] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.760387] [] ? hrtimer_interrupt+0x108/0x240
[44069.760396] [] ? fsnotify+0x1a5/0x290
[44069.760407] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.760416] [] ? apic_timer_interrupt+0x31/0x38
[44069.760428] [] ? mem_cgroup_resize_limit+0x108/0x1c0
[44069.760437] [] ? fput+0x0/0x30
[44069.760446] [] ? sys_write+0x67/0x70
[44069.760455] [] ? syscall_call+0x7/0xb
[44069.760464] [] ? init_intel_cacheinfo+0x23/0x394
[44069.760470] Dump stack from sample_hbp_handler
[44069.764134] testhrarr_timer_function: testhrarr_runcount 6
[44069.764147] testhrarr jiffies 10942441 ; ret: 1 ; ktnsec: 44069764144141
[44069.768133] testhrarr_timer_function: testhrarr_runcount 7
[44069.768146] testhrarr jiffies 10942442 ; ret: 1 ; ktnsec: 44069768142976
[44069.772134] testhrarr_timer_function: testhrarr_runcount 8
[44069.772148] testhrarr jiffies 10942443 ; ret: 1 ; ktnsec: 44069772144121
[44069.776132] testhrarr_timer_function: testhrarr_runcount 9
[44069.776145] testhrarr jiffies 10942444 ; ret: 1 ; ktnsec: 44069776141971
[44069.780133] testhrarr_timer_function: testhrarr_runcount 10
[44069.780141] testhrarr [ 5, 7, 9, 11, 13, ]
...we get a stack trace exactly three times: once during testhrarr_startup, and twice in testhrarr_timer_function (once for runcount==0 and once for runcount==5), as expected.
Well, hope this helps someone,
Cheers!
Makefile
CONFIG_MODULE_FORCE_UNLOAD=y
# debug build:
# "CFLAGS was changed ... Fix it to use EXTRA_CFLAGS."
override EXTRA_CFLAGS+=-g -O0
obj-m += testhrarr.o
#testhrarr-objs := testhrarr.o
all:
	@echo EXTRA_CFLAGS = $(EXTRA_CFLAGS)
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
testhrarr.c
/*
* [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
* https://stackoverflow.com/questions/16920238/reliability-of-linux-kernel-add-timer-at-resolution-of-one-jiffy/17055867#17055867
* https://stackoverflow.com/questions/8516021/proc-create-example-for-kernel-module/18924359#18924359
* http://lxr.free-electrons.com/source/samples/hw_breakpoint/data_breakpoint.c
*/
#include <linux/module.h> /* Needed by all modules */
#include <linux/kernel.h> /* Needed for KERN_INFO */
#include <linux/init.h> /* Needed for the macros */
#include <linux/jiffies.h>
#include <linux/time.h>
#include <linux/proc_fs.h> /* /proc entry */
#include <linux/seq_file.h> /* /proc entry */
#include <linux/slab.h> /* kcalloc, kfree */
#define ARRSIZE 5
#define MAXRUNS (2*ARRSIZE)
#include <linux/hrtimer.h>
#define HWDEBUG_STACK 1
#if (HWDEBUG_STACK == 1)
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
#include <linux/kallsyms.h> /* kallsyms_lookup_name */
struct perf_event * __percpu *sample_hbp;
static char ksym_name[KSYM_NAME_LEN] = "testhrarr_arr";
module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
" write operations on the kernel symbol");
#endif
static volatile int testhrarr_runcount = 0;
static volatile int testhrarr_isRunning = 0;
static unsigned long period_ms;
static unsigned long period_ns;
static ktime_t ktime_period_ns;
static struct hrtimer my_hrtimer;
static int* testhrarr_arr;
static int* testhrarr_arr_first;
static enum hrtimer_restart testhrarr_timer_function(struct hrtimer *timer)
{
  unsigned long tjnow;
  ktime_t kt_now;
  int ret_overrun;
  printk(KERN_INFO
    " %s: testhrarr_runcount %d \n",
    __func__, testhrarr_runcount);
  if (testhrarr_runcount < MAXRUNS) {
    tjnow = jiffies;
    kt_now = hrtimer_cb_get_time(&my_hrtimer);
    ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
    printk(KERN_INFO
      " testhrarr jiffies %lu ; ret: %d ; ktnsec: %lld\n",
      tjnow, ret_overrun, ktime_to_ns(kt_now));
    testhrarr_arr[(testhrarr_runcount % ARRSIZE)] += testhrarr_runcount;
    testhrarr_runcount++;
    return HRTIMER_RESTART;
  }
  else {
    int i;
    testhrarr_isRunning = 0;
    // do not use KERN_DEBUG etc, if printk buffering until newline is desired!
    printk("testhrarr_arr [ ");
    for(i=0; i<ARRSIZE; i++) {
      printk("%d, ", testhrarr_arr[i]);
    }
    printk("]\n");
    return HRTIMER_NORESTART;
  }
}
static void testhrarr_startup(void)
{
  if (testhrarr_isRunning == 0) {
    testhrarr_isRunning = 1;
    testhrarr_runcount = 0;
    testhrarr_arr[0] = 0; //just the first element
    hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
  }
}
static int testhrarr_proc_show(struct seq_file *m, void *v) {
  if (testhrarr_isRunning == 0) {
    seq_printf(m, "testhrarr proc: startup\n");
    testhrarr_startup();
  } else {
    seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
  }
  return 0;
}
static int testhrarr_proc_open(struct inode *inode, struct file *file) {
  return single_open(file, testhrarr_proc_show, NULL);
}
static const struct file_operations testhrarr_proc_fops = {
  .owner = THIS_MODULE,
  .open = testhrarr_proc_open,
  .read = seq_read,
  .llseek = seq_lseek,
  .release = single_release,
};
#if (HWDEBUG_STACK == 1)
static void sample_hbp_handler(struct perf_event *bp,
             struct perf_sample_data *data,
             struct pt_regs *regs)
{
  printk(KERN_INFO "%s value is changed\n", ksym_name);
  dump_stack();
  printk(KERN_INFO "Dump stack from sample_hbp_handler\n");
}
#endif
static int __init testhrarr_init(void)
{
  struct timespec tp_hr_res;
#if (HWDEBUG_STACK == 1)
  struct perf_event_attr attr;
#endif
  period_ms = 1000/HZ;
  hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
  printk(KERN_INFO
    "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",
    testhrarr_runcount, HZ, period_ms, (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );
  testhrarr_arr = (int*)kcalloc(ARRSIZE, sizeof(int), GFP_ATOMIC);
  if (!testhrarr_arr) {
    // allocation can fail; bail out before anything touches the array
    return -ENOMEM;
  }
  testhrarr_arr_first = &testhrarr_arr[0];
  hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
  my_hrtimer.function = &testhrarr_timer_function;
  period_ns = period_ms*( (unsigned long)1E6L );
  ktime_period_ns = ktime_set(0,period_ns);
  printk(KERN_INFO
    " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n",
    &testhrarr_runcount, &testhrarr_arr, &(testhrarr_arr[0]), testhrarr_arr_first, &testhrarr_timer_function, &my_hrtimer, &my_hrtimer.function);
  proc_create("testhrarr_proc", 0, NULL, &testhrarr_proc_fops);
#if (HWDEBUG_STACK == 1)
  hw_breakpoint_init(&attr);
  if (strcmp(ksym_name, "testhrarr_arr_first") == 0) {
    // just for testhrarr_arr_first - the symbol is itself a pointer, so read
    // the pointer-sized value stored at the symbol's address to get the
    // "real" address it points to (unsigned long is pointer-sized in the kernel)
    attr.bp_addr = *((unsigned long*)kallsyms_lookup_name(ksym_name));
  } else {
    // the usual - address is kallsyms_lookup_name result
    attr.bp_addr = kallsyms_lookup_name(ksym_name);
  }
  attr.bp_len = HW_BREAKPOINT_LEN_1;
  attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;
  sample_hbp = register_wide_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler);
  if (IS_ERR((void __force *)sample_hbp)) {
    int ret = PTR_ERR((void __force *)sample_hbp);
    printk(KERN_INFO "Breakpoint registration failed\n");
    // the module will not load, so undo what was set up above
    remove_proc_entry("testhrarr_proc", NULL);
    kfree(testhrarr_arr);
    return ret;
  }
  // explicit cast needed to show 64-bit bp_addr as 32-bit address
  // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103
  printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n", ksym_name, (void*)(uintptr_t)attr.bp_addr);
#endif
  return 0;
}
static void __exit testhrarr_exit(void)
{
  int ret_cancel = 0;
  // remove the /proc entry first, so no new read can restart the timer
  remove_proc_entry("testhrarr_proc", NULL);
  while( hrtimer_callback_running(&my_hrtimer) ) {
    ret_cancel++;
  }
  if (ret_cancel != 0) {
    printk(KERN_INFO " testhrarr Waited for hrtimer callback to finish (%d)\n", ret_cancel);
  }
  if (hrtimer_active(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
  if (hrtimer_is_queued(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr queued hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
#if (HWDEBUG_STACK == 1)
  unregister_wide_hw_breakpoint(sample_hbp);
  printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
#endif
  // free the array only after the timer is stopped, so the timer
  // callback can no longer touch freed memory
  kfree(testhrarr_arr);
  printk(KERN_INFO "Exit testhrarr\n");
}
module_init(testhrarr_init);
module_exit(testhrarr_exit);
MODULE_LICENSE("GPL");
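Answer 1 (score: 1)
You need hardware support for this. The CPU has to detect when a certain memory address is written to and invoke some code: an interrupt or exception handler. In my experience, I have seen this on PowerPC platforms, but not on x86. It is called a hardware watchpoint.
In theory you could emulate this behavior if you run inside an emulator, but I am not at all familiar with the emulators that currently exist.
EDIT: I've dug around a bit, and it seems there is a generic hardware breakpoint interface in Linux, and x86 has such a register. It is called DR7. Have a look at the functions in 'include/linux/hw_breakpoint.h'. It looks like ptrace and/or perf use these interfaces. Good luck debugging!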
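As background for the DR7 remark: on x86 the watched addresses are held in DR0 through DR3, while DR7 carries the per-slot enable bits plus the access-type and length fields. The sketch below only computes that bit pattern in user space; the function and enum names are made up for illustration, and it never touches a real debug register (in the kernel, the hw_breakpoint layer programs them on your behalf and picks the slot itself).

```c
#include <stdint.h>

/* Access-type (R/W) and length (LEN) encodings of the DR7 fields. */
enum dr7_rw  { DR7_RW_EXEC = 0, DR7_RW_WRITE = 1, DR7_RW_READWRITE = 3 };
enum dr7_len { DR7_LEN_1 = 0, DR7_LEN_2 = 1, DR7_LEN_8 = 2, DR7_LEN_4 = 3 };

/* Hypothetical helper: builds the DR7 bits that locally enable
 * breakpoint slot 0..3 with the given access type and length. */
uint32_t dr7_bits(unsigned slot, enum dr7_rw rw, enum dr7_len len)
{
    uint32_t local_enable = UINT32_C(1) << (slot * 2);          /* L0..L3    */
    uint32_t rw_field     = (uint32_t)rw  << (16 + slot * 4);   /* R/W0..3   */
    uint32_t len_field    = (uint32_t)len << (18 + slot * 4);   /* LEN0..3   */

    return local_enable | rw_field | len_field;
}
```

A one-byte write watchpoint in slot 0, which is roughly what HW_BREAKPOINT_W with HW_BREAKPOINT_LEN_1 asks for, corresponds to `dr7_bits(0, DR7_RW_WRITE, DR7_LEN_1)`, i.e. 0x00010001.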