Breaking the stack / call frame information chain on ELF/Linux?

Time: 2018-05-20 08:11:08

Tags: x86-64 dwarf stack-unwinding

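I'm trying to do a rather niche thing that effectively breaks the CFI (call frame information in the DWARF EH info) and the rbp/rsp link between frames. The main reason is that, beyond a certain point in a thread's control flow, I want to do a call-with-continuation. That is basically a one-way tail call, combined with a yield that should clean up the stack and then return to the top of the stack, ready to execute again at the continuation point.

In principle the idea works, as long as I keep the code that messes with the stack commented out: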

    /*
     * x86_64 SysV:
     *   rdi, rsi, rdx, rcx, r8, r9, xmm0-xmm7
     */
    __asm {
        mov  rax, TCB
        mov  rax, qword ptr [rax] OSThreadControlBlock.StartFn;
        call rax;
        mov  rax, 0;
        // end of stack
        //push rax;
        //push rax;
        //push rbx;
        // last "real" frame
        //push rbp;
        //mov  rbp, rsp;
        //push rbx;
        // make the call
        mov  rdi, RL;
        lea  rax, qword ptr __OS_RUNLOOP_START__;
        call rax;
        // trap if it returns
        //int  3;
    }

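I understand the general principle behind the SP/BP registers, and I'm deliberately building with -fno-omit-frame-pointer. My question is: after spending hours trying to get this to work, what am I missing? It seems that any change to the stack layout, even something as simple as keeping it aligned before the call, causes a snowballing crash that starts with something like this (output of a custom signal handler):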

Received fatal signal: Segmentation fault (11) [thread: 10298 ctl-thrd]
 * Unknown error at address 0x0 Regs:
   %rip=0x00000000003E2D91 %rbp=0x00007F820A547EA8 %rsp=0x00007F820A547DE8 %rax=0x00007F820A547DE8 %rbx=0x00007F820A547F38
   %rdi=0x00000000002121E1 %rsi=0x000000000000007B %rcx=0x000000000000000A  %r8=0x0000000000000900  %r9=0x00007F820A5490C0

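The ABI in question is libc++ / libc++abi on Linux x86_64, with a toolchain based on LLVM/Clang 6.0.x. I've tried pretty much everything. I know the code above looks odd, but it uses the MS extension for inline assembly, and I've checked the disassembly many times to confirm that it generates exactly the code I expect. As far as I can tell, this is some odd conflict between CFI and the frame-pointer-based machinery, but I'm not that strong on x86_64, so I'm not sure what I'm missing. I know the unwinding process is meant to be terminated by a marker (a null SP/FP on the last frame), but at this point I'm really lost, since even the debugger is completely thrown off by it.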
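For reference, the frame-pointer side of that link can be inspected directly. The following is a minimal sketch, assuming GCC or Clang (for `__builtin_frame_address`) and a target whose frame record is "saved frame pointer, then return address", as on x86-64 when frame pointers are kept. The names `frame_record`, `parent` and `child_sees_parent` are hypothetical and exist only for illustration. It checks that the slot at `[rbp]` in a callee holds the caller's `rbp`, which is the link a frame-pointer walker follows until it reaches the null marker.

```c
#include <assert.h>
#include <stddef.h>

/* With frame pointers kept, each x86-64 frame starts with this record:
 * [rbp] = the caller's saved rbp, [rbp + 8] = the return address. */
struct frame_record {
    struct frame_record *prev;  /* saved frame pointer of the caller */
    void *return_address;       /* pushed by the call instruction */
};

/* Hypothetical global: parent() records its own frame address here. */
static void *parent_frame;

/* Returns 1 if this frame's saved rbp points at the caller's frame. */
__attribute__((noinline)) int child_sees_parent(void) {
    struct frame_record *fp = __builtin_frame_address(0);
    assert(fp != NULL);
    return fp->prev == parent_frame;
}

__attribute__((noinline)) int parent(void) {
    parent_frame = __builtin_frame_address(0);
    int linked = child_sees_parent();
    /* Empty asm with a memory clobber: keeps the call above from being
     * turned into a tail call, which would skip parent's frame. */
    __asm__ volatile("" ::: "memory");
    return linked;
}
```

Calling `parent()` from `main` should return 1 as long as neither function is inlined. This only shows the rbp chain; the CFI in `.eh_frame` describes the same frames independently, so hand-written pushes that change rsp without matching CFI can leave the two views disagreeing.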

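If anyone has any suggestions, I'd really appreciate it. I've tried all sorts of things, but the core problem is always the same: as soon as I touch the stack, even if I restore it afterwards, everything goes haywire. Clobbers outside the asm block don't really matter, since the last call is not meant to return normally. One thing I've noticed is that it seems to be connected to TLV (thread-local variables) in some way, but I'm not sure how NPTL sets that up.

Any help or advice would be much appreciated.

Edit

It looks like this comment from the Valgrind source might explain what is going on: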

/* NB 9 Sept 07.  There is a nasty kludge here in all these CALL_FN_
   macros.  In order not to trash the stack redzone, we need to drop
   %rsp by 128 before the hidden call, and restore afterwards.  The
   nastyness is that it is only by luck that the stack still appears
   to be unwindable during the hidden call - since then the behaviour
   of any routine using this macro does not match what the CFI data
   says.  Sigh.

   Why is this important?  Imagine that a wrapper has a stack
   allocated local, and passes to the hidden call, a pointer to it.
   Because gcc does not know about the hidden call, it may allocate
   that local in the redzone.  Unfortunately the hidden call may then
   trash it before it comes to use it.  So we must step clear of the
   redzone, for the duration of the hidden call, to make it safe.

   Probably the same problem afflicts the other redzone-style ABIs too
   (ppc64-linux, ppc32-aix5, ppc64-aix5); but for those, the stack is
   self describing (none of this CFI nonsense) so at least messing
   with the stack pointer doesn't give a danger of non-unwindable
   stack. */
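The red-zone step that the comment describes can be sketched as follows. This is an illustration only, not Valgrind's actual macro: it uses GNU extended asm rather than the MS-style block from the question, and `hidden_call` and `add_one` are hypothetical names. It assumes x86-64 SysV with GCC or Clang, and falls back to a plain call elsewhere. The asm moves rsp down past the 128-byte red zone, realigns it to 16 bytes, makes a call the compiler cannot see, and then restores rsp.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee, used only to have something to call. */
long add_one(long x) { return x + 1; }

/* Calls fn(arg) from inside inline asm, so the compiler does not know
 * that a call happens here and may have placed locals in the red zone. */
long hidden_call(long (*fn)(long), long arg) {
    assert(fn != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    long ret;
    __asm__ volatile(
        "movq %%rsp, %%rbx\n\t"  /* remember the original stack pointer */
        "subq $128, %%rsp\n\t"   /* step clear of the 128-byte red zone */
        "andq $-16, %%rsp\n\t"   /* SysV: rsp is 16-byte aligned at a call */
        "callq *%[fn]\n\t"       /* the hidden call; arg is already in rdi */
        "movq %%rbx, %%rsp\n\t"  /* restore the original stack pointer */
        : "=a"(ret), "+D"(arg)
        : [fn] "r"(fn)
        : "rbx", "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14",
          "xmm15", "memory", "cc");
    return ret;
#else
    return fn(arg);  /* other targets: an ordinary, visible call */
#endif
}
```

For example, `hidden_call(add_one, 41)` yields 42. As the comment warns, nothing in this sketch updates the CFI for the surrounding function, so while the hidden call is running the unwind tables may not describe the real stack layout.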

0 answers:

No answers yet.