Question

我正在尝试从内核模块调用系统调用，我有这个代码：

    set_fs( get_ds() );    // lets our module do the system-calls 


    // Save everything before systemcalling

    asm ("     push    %rax     "); 
    asm  ("     push    %rdi     "); 
    asm  ("     push    %rcx     "); 
    asm  ("     push    %rsi     "); 
    asm  ("     push    %rdx     "); 
    asm  ("     push    %r10     "); 
    asm  ("     push    %r8      "); 
    asm  ("     push    %r9      "); 
    asm  ("     push    %r11     "); 
    asm  ("     push    %r12     "); 
    asm  ("     push    %r15     "); 
    asm  ("     push    %rbp     "); 
    asm  ("     push    %rbx     "); 


    // Invoke the long sys_mknod(const char __user *filename, int mode, unsigned dev);

    asm volatile ("     movq    $133, %rax     "); // system call number

    asm volatile ("    lea    path(%rip), %rdi     "); // path is char path[] = ".."

    asm volatile ("     movq    mode, %rsi     "); // mode is S_IFCHR | ...

    asm volatile ("     movq    dev, %rdx     ");  // dev is 70 >> 8

    asm volatile ("     syscall     "); 


      // POP EVERYTHING 

    asm ("     pop     %rbx     "); 
    asm ("     pop        %rbp     "); 
    asm ("     pop     %r15     "); 
    asm ("     pop        %r12     "); 
    asm ("     pop        %r11     "); 
    asm ("     pop        %r9      "); 
    asm ("     pop        %r8      "); 
    asm ("     pop        %r10     "); 
    asm ("     pop        %rdx     "); 
    asm ("     pop        %rsi     "); 
    asm ("     pop        %rcx     "); 
    asm ("     pop        %rdi     "); 
    asm ("     pop     %rax     "); 



    set_fs( savedFS );    // restore the former address-limit value

此代码无效，导致系统崩溃（它是内核模块）。

使用重定位信息转储该段代码是：

  2c:    50                      push  %rax 
  2d:    57                      push  %rdi 
  2e:    51                      push  %rcx 
  2f:    56                      push  %rsi 
  30:    52                      push  %rdx 
  31:    41 52                    push  %r10 
  33:    41 50                    push  %r8 
  35:    41 51                    push  %r9 
  37:    41 53                    push  %r11 
  39:    41 54                    push  %r12 
  3b:    41 57                    push  %r15 
  3d:    55                      push  %rbp 
  3e:    53                      push  %rbx 
  3f:    48 c7 c0 85 00 00 00     mov    $0x85,%rax 
  46:    48 8d 3d 00 00 00 00     lea    0x0(%rip),%rdi        # 4d <init_module+0x4d> 
            49: R_X86_64_PC32    path-0x4 
  4d:    48 83 c7 04              add    $0x4,%rdi 
  51:    48 8b 34 25 00 00 00     mov    0x0,%rsi 
  58:    00 
            55: R_X86_64_32S    mode 
  59:    48 8b 14 25 00 00 00     mov    0x0,%rdx 
  60:    00 
            5d: R_X86_64_32S    dev 
  61:    0f 05                    syscall 
  63:    5b                      pop    %rbx 
  64:    5d                      pop    %rbp 
  65:    41 5f                    pop    %r15 
  67:    41 5c                    pop    %r12 
  69:    41 5b                    pop    %r11 
  6b:    41 59                    pop    %r9 
  6d:    41 58                    pop    %r8 
  6f:    41 5a                    pop    %r10 
  71:    5a                      pop    %rdx 
  72:    5e                      pop    %rsi 
  73:    59                      pop    %rcx 
  74:    5f                      pop    %rdi 
  75:    58                      pop    %rax

我想知道..为什么在-0x4偏移量 49：R_X86_64_PC32 path-0x4？

我的意思是：模式和开发应该自动解决没有问题，但路径呢？为什么-0x4偏移？

我试图用

“补偿它”

lea 0x0（％rip），％rdi //这会以某种方式添加-0x4偏移量添加$ 0x4，％rdi ....

但代码仍然崩溃。

我哪里出错？

Answer 1

似乎你不能以这种方式做到这一点。请参阅system_call之前的comment：

 /*
  * Register setup:
  * rax  system call number
  * rdi  arg0
  * rcx  return address for syscall/sysret, C arg3
  * rsi  arg1
  * rdx  arg2
  * r10  arg3    (--> moved to rcx for C)
  * r8   arg4
  * r9   arg5
  * r11  eflags for syscall/sysret, temporary for C
  * r12-r15,rbp,rbx saved by C code, not touched.
  *
  * Interrupts are off on entry.
  * Only called from user space.
  *
  * XXX  if we had a free scratch register we could save the RSP into the stack frame
  *      and report it properly in ps. Unfortunately we haven't.
  *
  * When user can change the frames always force IRET. That is because
  * it deals with uncanonical addresses better. SYSRET has trouble
  * with them due to bugs in both AMD and Intel CPUs.
  */

因此，您无法从内核调用syscall。但您可以尝试将int $0x80用于此目的。我认为kernel_execve存根使用trick

Answer 2

我猜这里发生了什么是堆栈问题。与int $0x80不同，syscall指令不会为内核设置堆栈。如果您查看system_call:中的实际代码，您会看到SWAPGS_UNSAFE_STACK之类的内容。这个宏的内容是SwapGS指令 - 参见第152页here。当进入内核模式时，内核需要一种方法来拉取指向其数据结构的指针，而这条指令可以让它做到这一点。它通过将用户%gs寄存器与保存在特定于模型的寄存器中的值进行交换来实现，然后从该寄存器中拉出内核模式堆栈。

你可以想象一旦调用syscall入口点，这个交换产生了错误的值，因为你已经处于内核模式，并且内核开始尝试使用伪造的堆栈。您可以尝试手动调用SwapGS，使内核的SwapGS结果符合预期，并查看是否有效。

从内核崩溃的Linux系统调用（奇怪的偏移）

2 个答案: