
Question

Consider the following simple code:

#include <cmath>

float foo(float in) {
    return sqrtf(in);
}

With -ffast-math, clang generates sqrtss as expected. However, if I also pass -fstack-protector-all, it changes the sqrtss to rsqrtss, as you can see on godbolt. Why?

Answer 1

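Short and sweet:

rsqrtss is safer, and as a result less accurate and slower.

sqrtss is faster, and as a result less safe.

Why is rsqrtss safer?

It does not use the entire XMM register.

Why is rsqrtss slower?

Because it only produces an approximation, so it needs more instructions and registers to do the same job as sqrtss.

Why does rsqrtss use the reciprocal?

In a pinch, it seems the reciprocal of a square root can be computed faster and with less memory. Pico-splenda: lots of math.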
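
For a rough idea of what that math looks like, here is a minimal sketch of rebuilding a square root from a low-precision reciprocal square root plus one Newton-Raphson step. It is an illustration only, not the code clang emits: `rough_rsqrt` and `sqrt_via_rsqrt` are hypothetical names, the hardware estimate is simulated by truncating an exact result to about 12 significant bits, and the input is assumed to be positive and finite.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware estimate: computes 1/sqrt(x) and then
// clears the low 12 mantissa bits, so only about 12 significant bits survive.
float rough_rsqrt(float x) {
    float estimate = 1.0f / std::sqrt(x);
    std::uint32_t bits;
    std::memcpy(&bits, &estimate, sizeof bits);
    bits &= 0xFFFFF000u;  // keep sign, exponent, and the top 11 mantissa bits
    std::memcpy(&estimate, &bits, sizeof bits);
    return estimate;
}

// One Newton-Raphson step on f(y) = 1/(y*y) - x, followed by
// sqrt(x) = x * (1/sqrt(x)). Assumes x > 0; zero, negatives, infinities
// and NaN are not handled in this sketch.
float sqrt_via_rsqrt(float x) {
    float y = rough_rsqrt(x);
    y = y * (1.5f - 0.5f * x * y * y);  // roughly doubles the number of correct bits
    return x * y;
}
```

For x = 2.0f the unrefined estimate gives 2.0f * rough_rsqrt(2.0f) = 1.4140625, while sqrt_via_rsqrt(2.0f) lands within 1e-5 of std::sqrt(2.0f). The extra multiplies are the price of starting from an approximation.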

Long and bitter:

Research

What does -ffast-math do?

-ffast-math
    Enable fast-math mode. This defines the __FAST_MATH__ preprocessor
    macro, and lets the compiler make aggressive, potentially-lossy
    assumptions about floating-point math. These include:

    Floating-point math obeys regular algebraic rules for real numbers (e.g. + and * are associative, x/y == x * (1/y), and (a + b) * c == a * c + b * c),
    operands to floating-point operations are not equal to NaN and Inf, and
    +0 and -0 are interchangeable.
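
To see why the associativity assumption is "potentially lossy", here is a small sketch that sums the same three floats in two different orders. The function names are made up for illustration, and the stated results only hold when the file is built without fast-math, since under -ffast-math the compiler is allowed to reorder the additions.

```cpp
// Sum three floats left to right: (a + b) + c.
float sum_left(float a, float b, float c) { return (a + b) + c; }

// Sum the same three floats right to left: a + (b + c).
float sum_right(float a, float b, float c) { return a + (b + c); }

// Reports whether this translation unit was built in fast-math mode,
// using the __FAST_MATH__ macro mentioned in the flag description.
bool built_with_fast_math() {
#ifdef __FAST_MATH__
    return true;
#else
    return false;
#endif
}
```

With strict IEEE arithmetic, sum_left(1e20f, -1e20f, 1.0f) is 1.0f, but sum_right(1e20f, -1e20f, 1.0f) is 0.0f, because the 1.0f is absorbed when it is added to -1e20f first. Fast-math tells the compiler it may treat the two as the same expression.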

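What does -fstack-protector-all do?
- The answer can be found here.
- Basically, it "forces the use of the stack protector for all functions".

What is a "stack protector"?
- A good article for you.
- The very short, very condensed SparkNotes version:
  - A "stack protector" is used to prevent exploitation of stack overwrites. The stack protector implemented in gcc and clang adds an extra guard variable to the stack area of each function.
- Notable downsides:

  "Adding these checks will result in a little runtime overhead: more stack space is needed, but that is negligible except for really constrained systems... Do you want maximum security at the expense of performance? -fstack-protector-all is for you."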
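
The guard-variable idea can be mimicked in plain C++. This is only a toy model under loud assumptions: the real stack protector is inserted by the compiler, and `ToyFrame`, its 8-byte buffer, its layout, and the guard constant are all invented here for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model of a protected frame: an 8-byte "buffer" followed by a guard value.
struct ToyFrame {
    static constexpr std::uint64_t kGuard = 0x0123456789ABCDEFull;  // made-up value
    std::array<unsigned char, 16> bytes{};  // [0,8) buffer, [8,16) guard

    ToyFrame() {
        const std::uint64_t guard = kGuard;
        std::memcpy(bytes.data() + 8, &guard, sizeof guard);
    }

    // Copies n bytes into the frame without checking the 8-byte buffer limit.
    // The length is clamped to the whole frame so the model stays well-defined.
    void unchecked_write(const unsigned char* src, std::size_t n) {
        if (n > bytes.size()) n = bytes.size();
        std::memcpy(bytes.data(), src, n);
    }

    // What the check before returning boils down to: is the guard still intact?
    bool guard_intact() const {
        std::uint64_t current;
        std::memcpy(&current, bytes.data() + 8, sizeof current);
        return current == kGuard;
    }
};
```

Writing 8 bytes leaves guard_intact() true; writing 12 bytes spills into the guard and guard_intact() becomes false. That comparison, plus the extra stack slot, is the kind of overhead the quote above is talking about.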

What is sqrtss?

According to @godbolt:

    Computes the square root of the low single-precision floating-point value
    in the second source operand and stores the single-precision floating-point
    result in the destination operand. The second source operand can be an XMM
    register or a 32-bit memory location. The first source and destination
    operands is an XMM register.

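What is a "source operand"?
- A tutorial can be found here.
- Essentially, an operand is the location of data in the computer. Imagine the simple instruction x + x = y: you need to know what x is, and that is the source operand. The result is stored in the destination, "y". Note that the "+" sign, usually called the "operation", is ignored here because it does not matter in this example.

What is an "XMM register"?
- An explanation can be found here.
- It is just a specific type of register. It is mostly used for floating-point math (which, surprisingly enough, is the math you are trying to do).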

What is rsqrtss?

Again, according to @godbolt:

    Computes an approximate reciprocal of the square root of the low
    single-precision floating-point value in the source operand (second operand)
    stores the single-precision floating-point result in the destination operand.
    The source operand can be an XMM register or a 32-bit memory location. The
    destination operand is an XMM register. The three high-order doublewords of
    the destination operand remain unchanged. See Figure 10-6 in the Intel® 64 and
    IA-32 Architectures Software Developer’s Manual, Volume 1, for an illustration
    of a scalar single-precision floating-point operation.
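
The "three high-order doublewords remain unchanged" part can be pictured with a portable toy model. This is a sketch, not the instruction itself: `Xmm` and `scalar_rsqrt_model` are hypothetical names, and the model computes an exact reciprocal square root where the real instruction only gives an approximation.

```cpp
#include <array>
#include <cmath>

// Toy model of a 128-bit XMM register holding four single-precision lanes.
using Xmm = std::array<float, 4>;

// Models the scalar behaviour described above: only lane 0 of dst is written,
// and lanes 1..3 keep whatever they held before.
void scalar_rsqrt_model(Xmm& dst, const Xmm& src) {
    dst[0] = 1.0f / std::sqrt(src[0]);  // exact here; the real instruction is approximate
}
```

Starting with dst = {9, 1, 2, 3} and src = {4, 7, 7, 7}, the call leaves dst = {0.5, 1, 2, 3}: the low lane is replaced and the other three lanes are untouched.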

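What is a "doubleword"?
- A simple definition.
- It is a unit of measure for computer memory, like "bit" or "byte". Unlike "bit" or "byte", however, it is not universal and depends on the computer's architecture.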
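
For the x86 case relevant here, the sizes can be written down as a compile-time check. The alias and constant names below are made up for illustration; the sizes follow Intel's x86 naming, where a word is 16 bits and a doubleword is 32 bits.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Intel's x86 naming: a word is 16 bits, a doubleword is 32 bits.
using Word = std::uint16_t;
using Doubleword = std::uint32_t;

constexpr std::size_t kXmmBits = 128;  // width of one XMM register
constexpr std::size_t kDoublewordBits = sizeof(Doubleword) * CHAR_BIT;
constexpr std::size_t kDoublewordsPerXmm = kXmmBits / kDoublewordBits;

static_assert(kDoublewordsPerXmm == 4, "an XMM register holds four doublewords");
```

So one XMM register holds four doublewords: a scalar single-precision instruction works on the lowest one, and the "three high-order doublewords" are the other three.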

What does "Figure 10-6 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1" look like?
- Here you go.

Disclaimer: most of this knowledge comes from outside sources. I just installed clang to help answer your question. I am not an expert.

Why does clang generate rsqrt if the stack protector is turned on?
