Question

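I want my exception handlers and debug functions to be able to print call stack backtraces, basically just like the backtrace() library function in glibc. Unfortunately, my C library (Newlib) doesn't provide such a call.

I've got something like this: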

#include <stdio.h>
#include <unwind.h> // GCC's internal unwinder, part of libgcc

_Unwind_Reason_Code trace_fcn(struct _Unwind_Context *ctx, void *d)
{
    int *depth = (int*)d;
    // _Unwind_GetIP does not return a plain int, so cast to match the format
    printf("\t#%d: program counter at %08lx\n", *depth, (unsigned long)_Unwind_GetIP(ctx));
    (*depth)++;
    return _URC_NO_REASON;
}

void print_backtrace_here(void)
{
    int depth = 0;
    _Unwind_Backtrace(&trace_fcn, &depth);
}

This basically works, but the resulting traces aren't always complete. For example, if I do

int func3() { print_backtrace_here(); return 0; }
int func2() { return func3(); }
int func1() { return func2(); }
int main()  { return func1(); }

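the backtrace only shows func3() and main(). (This is a toy example, but I have checked the disassembly and confirmed that these functions are all present in full, neither optimized out nor inlined.)

Update: I tried this backtrace code on the old ARM7 system, with the same (or at least as equivalent as possible) compiler options and linker script, and it prints a correct, full backtrace (i.e. func1 and func2 aren't missing). In fact it even backtraces past main into the boot initialization code. So presumably the problem isn't with the linker script or compiler options. (Also, I confirmed from the disassembly that no frame pointer is used in this ARM7 test either.)

The code is compiled with -fomit-frame-pointer, but my platform (bare metal ARM Cortex M3) defines an ABI that doesn't use a frame pointer anyway. (A previous version of this system used the old APCS ABI on ARM7, with forced stack frames and a frame pointer, and a backtrace like the one here, which worked perfectly.)

The whole system is compiled with -fexceptions, which ensures that the metadata _Unwind needs is included in the ELF file. (_Unwind is designed for exception handling, I think.)

So, my question is: is there a "standard", accepted way of getting reliable backtraces in embedded systems using GCC?

I don't mind having to mess around with the linker scripts and crt0 code if necessary, but I don't want to have to make any changes to the toolchain itself.

Thanks!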

Answer 1

For this you need -funwind-tables or -fasynchronous-unwind-tables. On some targets this is required for _Unwind_Backtrace to work properly!

Answer 2

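gcc performs return optimization (tail calls). func1() and func2() do not call func2()/func3(); instead they jump to func2()/func3(), so func3() can return directly to main().

In your case, func1() and func2() do not need to set up a stack frame. But even if they did (e.g. for local variables), gcc could still apply the optimization when the function call is the last instruction: it cleans up the stack before jumping to func3().

Have a look at the generated assembler code to see it.

Edit/Update:

To verify that this is the reason, do something after the function call that the compiler cannot reorder (e.g. use the return value). Or just try compiling with -O0.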
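A minimal sketch of that check, assuming GCC (the traced_ names are made up for illustration and are not the asker's code). Each intermediate function does arithmetic on the callee's result, so the call is no longer the last operation and cannot be turned into a plain jump. The noinline attribute is there only to keep the optimizer from folding the chain away.

```c
/* Hypothetical variants of func1..func3 from the question. */

/* In the real test, print_backtrace_here() would be called in here. */
__attribute__((noinline)) int traced_func3(void)
{
    return 0;
}

/* The addition happens after the call returns, so this is not a tail call. */
__attribute__((noinline)) int traced_func2(void)
{
    return traced_func3() + 1;
}

__attribute__((noinline)) int traced_func1(void)
{
    return traced_func2() + 1;
}
```

If func1 and func2 show up in the backtrace with this variant, the missing frames in the original were most likely caused by the jump-instead-of-call optimization.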

Answer 3

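Since ARM platforms do not use a frame pointer, you never quite know how big a stack frame is, and you cannot simply unwind the stack beyond the single return address in R14.

When investigating a crash for which we do not have debug symbols, we simply dump the whole stack and look up the closest symbol for each item that falls in the instruction range. It does produce a lot of false positives, but it can still be very useful for investigating crashes.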
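The scan described above could look roughly like the following. This is a sketch under assumptions, not the answerer's code: the function name and parameters are invented here, and the code address range would normally come from the linker script. The symbol lookup itself (map file, nm, addr2line) is left to the host side.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: walk `count` words of a dumped stack and copy every
 * value that falls inside [code_start, code_end) into `out`.
 * Returns the number of candidates stored (at most `out_cap`).
 * Any word that merely looks like a code address is kept, which is where
 * the false positives come from.
 */
size_t scan_stack_for_code_addrs(const uint32_t *words, size_t count,
                                 uint32_t code_start, uint32_t code_end,
                                 uint32_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < count && found < out_cap; ++i) {
        uint32_t value = words[i];

        if (value >= code_start && value < code_end) {
            out[found++] = value;
        }
    }
    return found;
}
```

Each stored value can then be matched against the nearest preceding symbol offline.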

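If you are running pure ELF executables, you can separate the debug symbols from your release executable. gdb can then help you work out what happened from a standard unix core dump.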

Answer 4

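Some compilers, like GCC, optimize function calls such as the ones in your example. For that code fragment to work, there is no need to store the intermediate return pointers in the call chain. It is perfectly fine to return from func3() straight to main(), because the intermediate functions do nothing besides calling another function.

This is not the same as code elimination (in fact the intermediate functions could be optimized out completely), and a separate compiler option controls this kind of optimization.

If you use GCC, try -fno-optimize-sibling-calls

Another handy GCC option is -mno-sched-prolog, which prevents instruction reordering in the function prologue. That is vital if you want to parse the code byte by byte, as is done here: http://www.kegel.com/stackcheck/checkstack-pl.txt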

Answer 5

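This is hacky, but I've found it works quite well considering the amount of code/RAM space required:

Assuming you are using ARM THUMB mode, compile with the following options: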

-mtpcs-frame -mtpcs-leaf-frame  -fno-omit-frame-pointer

The following function retrieves the callstack. See the comments for details:

#include <stdint.h>
#include <string.h>

/*
 * This should be compiled with:
 *  -mtpcs-frame -mtpcs-leaf-frame  -fno-omit-frame-pointer
 *
 *  With these options, the Stack pointer is automatically pushed to the stack
 *  at the beginning of each function.
 *
 *  This function basically iterates through the current stack finding the following combination of values:
 *  - <Frame Address>
 *  - <Link Address>
 *
 *  This combination will occur for each function in the call stack
 */
static void backtrace(uint32_t *caller_list, const uint32_t *caller_list_end, const uint32_t *stack_pointer)
{
    uint32_t previous_frame_address = (uint32_t)stack_pointer;
    uint32_t stack_entry_counter = 0;

    // be sure to clear the caller_list buffer (memset takes a size in bytes, not elements)
    memset(caller_list, 0, (size_t)(caller_list_end - caller_list) * sizeof(*caller_list));

    // loop until the buffer is full
    while(caller_list < caller_list_end)
    {
        // Attempt to obtain next stack pointer
        // The link address should come immediately after
        const uint32_t possible_frame_address = *stack_pointer;
        const uint32_t possible_link_address = *(stack_pointer+1);

        // Have we searched past the allowable size of a given stack?
        if(stack_entry_counter > PLATFORM_MAX_STACK_SIZE/4)
        {
            // yes, so just quit
            break;
        }
        // Next check that the frame address (i.e. stack pointer for the function)
        // and Link address are within an acceptable range
        else if((possible_frame_address > previous_frame_address) &&
                ((possible_frame_address < previous_frame_address + PLATFORM_MAX_STACK_SIZE)) &&
               ((possible_link_address  & 0x01) != 0) && // in THUMB mode the address will be odd
                (possible_link_address > PLATFORM_CODE_SPACE_START_ADDRESS &&
                 possible_link_address < PLATFORM_CODE_SPACE_END_ADDRESS))
        {
            // We found two acceptable values

            // Store the link address
            *caller_list++ = possible_link_address;

            // Update the book-keeping registers for the next search
            previous_frame_address = possible_frame_address;
            stack_pointer = (uint32_t*)(possible_frame_address + 4);
            stack_entry_counter = 0;
        }
        else
        {
            // Keep iterating through the stack until we find an acceptable combination
            ++stack_pointer;
            ++stack_entry_counter;
        }
    }

}

You will need to update the #defines for your platform.

Then call the following to fill a buffer with the current call stack:

uint32_t callers[8];
uint32_t sp_reg;
__ASM volatile ("mov %0, sp" : "=r" (sp_reg) );
backtrace(callers, &callers[8], (uint32_t*)sp_reg);

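Again, this is rather hacky, but I've found it to work quite well. The buffer will be filled with the link address of each function call in the call stack.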
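One practical detail when reading that buffer: the function above only accepts odd values because Thumb return addresses have bit 0 set. Before looking an entry up in a map file or with a tool such as addr2line, it may help to clear that bit. A small sketch (the helper name is made up here):

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the Thumb state bit (bit 0) of a saved link
 * address. The result is the address of the instruction the callee returns
 * to, i.e. the one right after the call, inside the calling function.
 */
uint32_t link_address_to_code_address(uint32_t link_address)
{
    return link_address & ~UINT32_C(1);
}
```

Unused slots in the buffer stay zero because of the memset at the start of the function, so a zero entry can be treated as the end of the list.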

Answer 6

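Does your executable contain debugging information, i.e. was it compiled with the -g option? I think this is required to get a full stack trace without a frame pointer.

You might need -gdwarf-2 to make sure it uses a format that includes unwinding information.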

How to get a call stack backtrace? (deeply embedded, no library support)

6 answers: