Question

We have one translation unit (and only that one) that we want compiled with AVX2. It tells GCC so up front, on the first line of the file:

#pragma GCC target "arch=core-avx2,tune=core-avx2"

This used to work with GCC 4.8 and 4.9, but starting with 6 (7 and 8 also tried) we get this warning, which we treat as an error:

error: SSE instruction set disabled, using 387 arithmetics

on the first function that returns a float. I tried enabling SSE 4.2 as well (in addition to avx and avx2), like this:

#pragma GCC target "sse4.2,arch=core-avx2,tune=core-avx2"

But that is not enough; the error persists.

EDIT:

Relevant compiler flags (we use AVX for most things):

-mfpmath=sse,387 -march=corei7-avx -mtune=corei7-avx

EDIT2: minimal sample:

#pragma GCC target "arch=core-avx2,tune=core-avx2"

#include <immintrin.h>
#include <math.h>

static inline float
lg1pf( float x ) {
    return log1pf(x)*1.44269504088896338700465f;
}

int main()
{
  log1pf(2.0f);
}

Compiled like this:

gcc -o test test.c -O2 -Wall -Werror -pedantic -std=c99 -mfpmath=sse,387 -march=corei7-avx -mtune=corei7-avx

In file included from /home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/immintrin.h:45:0,
                 from test.c:3:
/home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/avx512fintrin.h: In function ‘_mm_add_round_sd’:
/home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/avx512fintrin.h:1412:1: error: SSE register return with SSE disabled
 {
 ^

Potential solution

#pragma GCC target "avx2"

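worked for me, with no other changes to the code.

Related issue: applying the attribute to individual functions does not work either when it includes arch=; only the plain avx2 form compiles: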

__attribute__((__target__("arch=broadwell")))  // does not compile
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }

__attribute__((__target__("avx2,arch=broadwell"))) // does not compile
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }

__attribute__((__target__("avx2"))) // compiles
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }

Answer 1

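This looks like a bug. A #pragma GCC target before #include <immintrin.h> somehow breaks the header, IDK why. Even if AVX2 is enabled on the command line with -march=haswell, the #pragma seems to break inlining of any intrinsics defined after it.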

You can use the #pragma after the headers, but then using intrinsics that were not enabled on the command line fails.

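Even a more modern target name such as #pragma GCC target "arch=haswell" causes the error, so it is not that old, obscure target names like corei7-avx are broken in general. They still work on the command line. If you want to enable something for a whole file, the standard way is to use compiler options, not pragmas.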
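As a rough sketch of the "compiler options, not pragmas" approach: the file below contains no target pragma at all and relies on the predefined __AVX2__ macro, which GCC defines when AVX2 is enabled on the command line. The file name kernels.c and the function add_i32 are made up for illustration; they are not from the question.

```c
/* kernels.c (hypothetical file name).
 * Build only this file with AVX2, for example:
 *     gcc -O2 -mavx2 -c kernels.c
 * No #pragma GCC target is used; the command line decides. */
#include <stddef.h>
#include <stdint.h>

#ifdef __AVX2__
#include <immintrin.h>
#endif

/* out[i] = a[i] + b[i] for i in [0, n). */
void add_i32(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    size_t i = 0;
#ifdef __AVX2__
    /* Vector path: only compiled when AVX2 was enabled on the command line. */
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_add_epi32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when AVX2 is not enabled. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Built without -mavx2 the same file still compiles and falls back to the scalar loop, so a wrong build flag shows up as a slowdown rather than a compile error.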

GCC does claim, though, to support target options on a per-function basis with pragmas or __attribute__: https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html
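A minimal sketch of the per-function form, using the plain "avx2" attribute that the question reports as compiling. The names scale, scale_avx2 and scale_scalar are hypothetical, and the run-time check with __builtin_cpu_supports is an addition that does not appear in the thread; it is there so the AVX2 function is only called on CPUs that have AVX2. This has not been checked against every GCC version discussed here.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX2_DISPATCH 1

/* Only this function is compiled with AVX2 enabled; the rest of the
 * file keeps whatever target the command line selected. */
__attribute__((target("avx2")))
static void scale_avx2(float *v, size_t n, float k)
{
    const __m256 vk = _mm256_set1_ps(k);
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        _mm256_storeu_ps(v + i, _mm256_mul_ps(_mm256_loadu_ps(v + i), vk));
    for (; i < n; i++)      /* leftover elements */
        v[i] *= k;
}
#endif

/* Portable fallback. */
static void scale_scalar(float *v, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        v[i] *= k;
}

/* Multiply every element of v by k, choosing the AVX2 version at run time. */
void scale(float *v, size_t n, float k)
{
#ifdef HAVE_AVX2_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        scale_avx2(v, n, k);
        return;
    }
#endif
    scale_scalar(v, n, k);
}
```

Unlike #pragma GCC target, the attribute spelling is also understood by Clang, which may matter if the code has to build with more than one GNU C compiler.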

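This is as far as I got experimenting with it (Godbolt compiler explorer with gcc8.1). Clang is unaffected because it ignores #pragma GCC target. (So the #pragma is not very portable; you probably want your code to work with any GNU C compiler, not just gcc itself.)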

 // breaks gcc when before immintrin.h
 // #pragma GCC target "arch=haswell"

#include <immintrin.h>
#include <math.h>

//#pragma GCC target "arch=core-avx2,tune=core-avx2"
#pragma GCC target "arch=haswell"

//static inline 
float
lg1pf( float x ) {
    return log1pf(x)*1.44269504088896338700465f;
}

// can accept / return wide vectors
__m128 nop(__m128 a) { return a; }
__m256 require_avx(__m256 a) { return a; }
// but error on using intrinsics if #include happened without target options
//__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }

// this works, though, because AVX is enabled at this point
// presumably so would  __builtin_ia32_whatever
// Without `arch=haswell`, this breaks, so we know the pragma "worked"
__m256 use_native_vec(__m256 a) { return a+a; }

GCC target for AVX2 disabling the SSE instruction set
