Question

I am trying to compile some inline assembly for a 64-bit iOS application.

Here is an example:

   int roundff(float _value) {
       int res;
       float temp;
       asm("vcvtr.s32.f32 %[temp], %[value] \n vmov %[res], %[temp]" : [res] "=r" (res), [temp] "=w" (temp) : [value] "w" (_value));
       return res;
   }

I get this error:

Unrecognized instruction mnemonic.

But this code compiles fine:

__asm__ volatile(
                 "add %[result], %[b], %[a];"
                 : [result] "=r" (result)
                 : [a] "r" (a), [b] "r" (b), [c] "r" (c)
                 );

I found that on aarch64 I have to use fcvt instead of vcvt, because

int a = (int)(10.123);

compiles to

fcvtzs w8, s8

But I don't know how to write it in inline assembly. Something like this

int roundff(float _value)
{
    int res;
    asm("fcvtzs %[res], %[value]" : [res] "=r" (res) : [value] "w" (_value));
    return res;
}

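doesn't work either and produces these errors:

Instruction 'fcvtz' can not set flags, but 's' suffix specified.

Invalid operand for instruction.

Also, I need to round rather than truncate (fcvtns).

Any help? Where can I read more about ARM (32/64) asm?

UPDATE: OK. This: float res = nearbyintf(v) compiles to the nice instruction frinti s0, s0. But why doesn't my inline assembly work on iOS with the clang compiler?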

Answer 1

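You can get the rounding you want by using standard math.h functions that inline to a single ARM instruction. Better still, the compiler knows what they do, so it may be able to optimize better, e.g. by proving that the integer can't be negative, if that's the case.

Check godbolt for the compiler output: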

#include <math.h>

int truncate_f_to_int(float v)
{
  int res = v;  // standard C cast: truncate with fcvtzs on ARM64
  // AMD64: inlines to cvtTss2si rax, xmm0   // Note the extra T for truncate
  return res;
}

int round_f_away_from_zero(float v)
{
    int res = roundf(v);  // optimizes to fcvtas on ARM64
  // AMD64: AND/OR with two constants before converting with truncation
    return res;
}


//#define NOT_ON_GODBOLT
// godbolt has a broken setup and gets x86-64 inline asm for lrintf on ARM64

#if defined(NOT_ON_GODBOLT) || defined (__x86_64__) || defined(__i386__)
int round_f_to_even(float v)
{
  int res =  lrintf(v);  // should inline to a convert using the current rounding mode
  // AMD64: inlines to cvtss2si rax, xmm0
  // nearbyintf(v); // ARM64: calls the math library
  // rintf(v); // ARM64: calls the math library
  return res;
}
#endif

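godbolt has the wrong headers installed for non-x86 architectures: it still uses the x86 math headers, including their inline asm.

Also, your roundff function that uses inline asm for fcvtzs compiles just fine on godbolt with gcc 4.8. Maybe you were trying to build for 32-bit ARM? But like I said, use the library function that does what you want, then check that it inlines to nice asm.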

Answer 2

Here is how you do it:

-(int) roundff:(float)a {
    int y;
    __asm__("fcvtzs %w0, %s1\n\t" : "=r"(y) : "w"(a));
    return y;
}

Take care,

/ A
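
For the rounding variant the question asks for, the same operand modifiers should carry over to fcvtns. Below is a minimal sketch in plain C rather than an Objective-C method. The name round_nearest_even and the non-AArch64 fallback branch are my own additions for illustration and do not come from the answers above. The %w0 modifier asks for the 32-bit w name of the integer register and %s1 for the single-precision s name of the FP register. Without them the operands are presumably printed as x and v registers, which would explain the "invalid operand" error in the question.

```c
/* Round a float to the nearest int, ties to even.
   Hypothetical helper for illustration; assumes the value fits in an int. */
static inline int round_nearest_even(float value)
{
#if defined(__aarch64__)
    int result;
    /* %w0 -> 32-bit integer register name (e.g. w8)
       %s1 -> single-precision FP register name (e.g. s0) */
    __asm__("fcvtns %w0, %s1" : "=r"(result) : "w"(value));
    return result;
#else
    /* Fallback so the sketch also builds on other targets:
       truncate toward zero, then adjust using the fractional part. */
    int truncated = (int)value;
    float frac = value - (float)truncated;   /* lies in (-1, 1) */
    if (frac > 0.5f || (frac == 0.5f && (truncated & 1)))
        truncated += 1;                      /* round up, or odd tie upward */
    else if (frac < -0.5f || (frac == -0.5f && (truncated & 1)))
        truncated -= 1;                      /* round down, or odd tie downward */
    return truncated;
#endif
}
```

fcvtns rounds to nearest with ties going to the even integer, so 2.5f gives 2 and 3.5f gives 4. If ties should round away from zero instead, fcvtas is the instruction to swap in.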

Using inline assembler in an iOS aarch64 application