Question

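I'm trying to write a simple if-else statement in assembly, but the ret returns to the start routine rather than to the test routine as I expected.

How can I fix this? Thanks.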

start:
    ldi r16, 0
    call test
    rjmp start

test:
    cpi r16, 0
    breq doFirst;

    cpi r16, 1
    breq doFirst;

    cpi r16, 2
    breq doSecond;

    jmp test;

doFirst:
    inc r16;
    ret;

doSecond:
    inc r17
    ret;

Answer 1

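The breq instruction does not store a return address on the stack, so ret cannot return to that point in the program. For that you need one of the AVR call, icall or rcall instructions.

A better solution, though, is to use brne to skip over the conditional code when the condition is false. doFirst does not need to be a subroutine if it is only called from one place.

You could also try compiling a few if statements with avr-gcc and reading the generated assembly to see how the compiler does it.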
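As a starting point for that experiment, here is a minimal sketch of the question's logic in C. The names `struct regs` and `step` are hypothetical, invented for illustration, and the registers are modelled as plain bytes rather than real AVR registers.

```c
#include <stdint.h>

/* Hypothetical model of the two registers used in the question. */
struct regs {
    uint8_t r16;
    uint8_t r17;
};

/* One pass through the intended if/else chain:
   r16 == 0 or r16 == 1 -> increment r16 (doFirst)
   r16 == 2             -> increment r17 (doSecond)
   anything else        -> no change */
struct regs step(struct regs r)
{
    if (r.r16 == 0 || r.r16 == 1) {
        r.r16++;
    } else if (r.r16 == 2) {
        r.r17++;
    }
    return r;
}
```

Compiling this with something like avr-gcc -O1 -S (plus a -mmcu= option for whatever part you target) writes the assembly to a .s file. You will typically see conditional branches that skip over code rather than a call/ret pair per condition.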

Answer 2

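ret is basically a pop into PC.

You jumped to doFirst rather than calling it, so you should use rjmp to jump back to a label inside test instead of using ret.

Alternatively, if doFirst really is a separate function, then what you did is an optimized tail call, so it returns to your caller. (That is how you, or a compiler, would implement int foo() { return bar(); }.)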
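To make the tail-call remark concrete, here is a small sketch in C. The body of `bar` is made up for illustration; only the shape of `foo` matters.

```c
/* Hypothetical helper; it plays the role of doFirst. */
int bar(int x)
{
    return x + 1;
}

/* foo ends by calling bar and returning its result unchanged.
   An optimizing compiler may turn that final call into a plain jump,
   in which case bar's ret returns straight to foo's caller. */
int foo(int x)
{
    return bar(x);
}
```

This is the same situation as branching to doFirst with breq: nothing new was pushed, so the ret in doFirst pops the return address that start pushed when it called test.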

You don't need doFirst to be a separate function. You can inline it once, because you don't need it to return to two different places in the test loop.

test:
    cpi    r16, 0
    breq   doFirst0;
    cpi    r16, 1
    breq   doFirst1;
 donefirst:

    cpi    r16, 2
    brne   noSecond   ;  conditionally SKIP the inlined doSecond body
     inc   r17        ; inlined body of doSecond
  noSecond:
    rjmp   test       ; You don't need a slow 4-byte JMP, just a short relative jump

doFirst0:
    inc   r16
doFirst1:          ; fall through instead of jumping back and finding that r16 == 1 now
    inc   r16
    rjmp  donefirst

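Note the fall-through from doFirst0 into doFirst1: with r16 == 0 we increment twice before reaching donefirst. If both breq instructions jumped to the same place instead, r16 == 0 would be incremented only once before the rest of the loop body ran, rather than being incremented, re-checked, found equal to 1, and incremented again.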

We could really just implement if (r16 <= 1) r16 = 2; with ldi r16, 2.

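If the body were too large to duplicate and you really did want a helper function with rcall / ret, you could conditionally jump over the rcall instruction.

Using 2 CPI / BREQ pairs is a very inefficient way to test for r16 <= 1 (unsigned). AVR has BRLO (BRanch if LOwer, unsigned), so a single compare is enough.

Because a BR instruction is faster when it is not taken, we keep that code out of line and branch to it only in the rare case, instead of inlining it (as I did for doSecond) and skipping over it with BRSH (BRanch if Same or Higher).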

test:
    cpi    r16, 2
    brlo   doFirst    ; taken when r16 < 2, i.e. r16 <= 1 (unsigned)
 donefirst:           ; r16 will become 2, so we can't put this label later and skip the next check

    cpi    r16, 2
    brne   noSecond   ;  conditionally SKIP the inlined doSecond body
    inc    r17
  noSecond:           ; this label might as well be at test: directly
    rjmp   test

doFirst:         ; rare case, only runs the first 2 iters, put it out of line
    ldi   r16, 2      ; 1 or 2 inc instructions make r16 = 2
    rjmp  donefirst
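The reasoning behind the single unsigned compare can be checked with a small C model. This is only a sketch: the function names are hypothetical, and `uint8_t` stands in for the 8-bit register. For unsigned values, r16 <= 1 is the same condition as r16 < 2, which is what a compare against 2 followed by BRLO decides.

```c
#include <stdbool.h>
#include <stdint.h>

/* The question's test: two equality checks (two CPI/BREQ pairs). */
bool two_equality_tests(uint8_t r16)
{
    return r16 == 0 || r16 == 1;
}

/* A single unsigned comparison: compare against 2, branch if lower. */
bool one_unsigned_compare(uint8_t r16)
{
    return r16 < 2;
}

/* doFirst collapsed to a constant load: 0 -> 2 and 1 -> 2,
   every other value is left untouched. */
uint8_t do_first(uint8_t r16)
{
    return one_unsigned_compare(r16) ? 2 : r16;
}
```

Both predicates agree for all 256 register values, so the second compare/branch pair in the question is redundant.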

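This whole loop doesn't make much sense unless an interrupt handler can modify or use r16 or r17. If that isn't the case, what you really want is to peel the first iterations and then fall into a trivial infinite loop.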

When we enter your loop with r16 <= 2 (assuming your rets went where you intended), we end up with r16 = 2 and r17 incremented once per iteration.

test:
    cpi    r16, 3
    brsh   above2          ; r16 >= 3, i.e. r16 > 2 (AVR has no "branch if higher")
    ldi    r16, 2          ; result of 1 or 2 inc instructions
    ; flags may differ from your version.

  infloop_r16_eq_2:        ; loop for the r16==2 case
    inc    r17
    rjmp   infloop_r16_eq_2

  above2:

  infloop:                 ; loop for the r16!=2 case
    rjmp   infloop

But if the registers are globals that can be modified asynchronously, we can't just check r16 once and then carry on forever.
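When nothing else touches the registers, the peeled form can be compared against the per-iteration check with a bounded C model. This is a sketch under stated assumptions: the loop body is taken to be if (r16 <= 1) r16 = 2; if (r16 == 2) r17++;, the names `struct state`, `run_checked` and `run_peeled` are hypothetical, and `n` is an artificial iteration count (the real loops never terminate).

```c
#include <stdint.h>

struct state {
    uint8_t r16;
    uint8_t r17;
};

/* Run the loop body n times, re-checking r16 on every pass. */
struct state run_checked(struct state s, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        if (s.r16 <= 1)
            s.r16 = 2;
        if (s.r16 == 2)
            s.r17++;           /* wraps modulo 256, like inc r17 */
    }
    return s;
}

/* Peeled form: decide once up front, then loop with no checks. */
struct state run_peeled(struct state s, unsigned n)
{
    if (s.r16 > 2 || n == 0)
        return s;              /* the do-nothing loop, or no passes at all */
    s.r16 = 2;
    for (unsigned i = 0; i < n; i++)
        s.r17++;
    return s;
}
```

The two functions return the same state for every starting value, which is only valid because the model has no interrupt handler changing r16 between passes.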

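I was curious what gcc would do, so I used register volatile unsigned char a asm("r16"); and likewise "r17". Surprisingly, this partly works, although gcc does warn that optimization may eliminate reads and/or writes to register variables. That seems to happen to the inc r17 with AVR gcc4.6.4 at -O3, but not at -O1. gcc7.3 for x86-64 does keep it at -O3. See it on the Godbolt compiler explorer, and see also How to remove "noise" from GCC/clang assembly output?.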

// x86-64 and AVR both have r14 and r15; x86-64 has no r16/r17, so this version
// builds for both targets. (The AVR listing below uses r16/r17.)
register volatile unsigned char a asm("r14");
register volatile unsigned char b asm("r15");

void test() {
    while(1) {
        if (a <= 1) 
            a++;        // or  a = 2;
        if (a == 2)
            b++;
    }
}

AVR gcc -O1 output:

test:
.L7:
        cpi r16,lo8(2)
        brsh .L2
        subi r16,lo8(-(1))   ; inc r16
.L2:
        cpi r16,lo8(2)
        brne .L7             ; back to the top
         ; else fall through
        subi r17,lo8(-(1))   ; inc r17
        rjmp .L7             ; then back to the top

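This looks reasonable.

lo8(2) is just 2; I don't know why gcc emits asm like that. It is probably useful with symbol addresses or the like, for example lo8(test) to get the low byte of a label's address.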
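The lo8(-(1)) operands in the listing can be worked through with a small C model. This is a sketch: `LO8` and `subi` are hypothetical stand-ins for the assembler operator and the instruction, and only the register result is modelled, not the status flags.

```c
#include <stdint.h>

/* Model of the assembler's lo8(): the low 8 bits of a value. */
#define LO8(x) ((uint8_t)((unsigned long)(x) & 0xFFu))

/* Model of SUBI on an 8-bit register: subtract an immediate and
   wrap modulo 256. Status flags are not modelled. */
uint8_t subi(uint8_t reg, uint8_t imm)
{
    return (uint8_t)(reg - imm);
}
```

LO8(-(1)) is 0xFF, and subtracting 0xFF from an 8-bit register gives the same register value as adding 1 modulo 256, which is why the comments in the listing read subi r16,lo8(-(1)) as inc r16.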

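This is a kind of tail-duplication optimization of the loop, but with one of the tails empty. So instead of jumping over the inc r17, we jump directly to the top of the loop.

brne .L7 could have jumped to the brsh instruction, because the flags are still set from cpi r16, 2. Gcc didn't do that because we told it the registers were volatile, so it does not optimize away the second read of the register.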

Simple conditional test with Assembly
