Question

I am trying out a very simple example of the AVX-512 gather instructions:

double __attribute__((aligned(64))) array3[17] = {1.0,  2.0,  3.0,  4.0,  5.0,  6.0,  7.0,  8.0,
                     9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0,
                    17.0};
int __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, 10, 11, 12, 13, 14, 15, 16};
__m512i i_index = _mm512_load_epi64(i_index_ar);
__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], 1);

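Unfortunately, the final call, to _mm512_i64gather_pd, causes a memory access error (core dumped).

The error message, in German: Speicherzugriffsfehler (Speicherabzug geschrieben), i.e. segmentation fault (core dumped).

I am using an Intel Xeon Phi (KNL) 7210.

Edit: the errors here were that I used 32-bit integers with a 64-bit load instruction, and that the scale argument of _mm512_i64gather_pd must be 8, i.e. sizeof(double).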
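
To illustrate the first of these errors, here is a small sketch (not from the original post) of what a 64-bit lane contains when the underlying array actually holds 32-bit ints. The helper name first_lane_as_i64 is hypothetical, and the result shown assumes a little-endian machine such as x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the first two 32-bit ints of an
 * index array as one 64-bit lane, which is how a 64-bit vector load
 * followed by a 64-bit-index gather would see them. */
int64_t first_lane_as_i64(const int32_t *idx32)
{
    int64_t lane;
    memcpy(&lane, idx32, sizeof lane); /* copies 8 bytes = two ints */
    return lane;
}
```

On a little-endian machine, the ints {1, 2} become the single index 2 * 2^32 + 1 = 8589934593, which points far outside a 17-element array and is enough on its own to explain the crash.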

Answer 1

I think you need to set scale to sizeof(double), not 1.

Change:

__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], 1);

to:

__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], sizeof(double));

See also: this question and its answers for a fuller explanation of Intel SIMD gathered loads and their usage.
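
The role of scale can be sketched with a scalar model of a single gather lane: the element is read from base + index * scale, where the offset is in bytes. This is only an illustration, not the intrinsic itself; gather_byte_offset and gather_lane are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offset that one gather lane adds to the base address. */
int64_t gather_byte_offset(int64_t index, int scale)
{
    /* Gather instructions only accept these scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return index * scale;
}

/* Scalar model of one lane: read a double at base + index * scale. */
double gather_lane(const void *base, int64_t index, int scale)
{
    double value;
    memcpy(&value,
           (const char *)base + gather_byte_offset(index, scale),
           sizeof value);
    return value;
}
```

With scale = sizeof(double), index 1 lands on the second element. With scale = 1, the same index lands one byte into the first element, so the lane reads a misaligned mix of two neighbouring doubles.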

—

Another problem: your indices need to be 64-bit integers, so change:

int __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, ...

to:

int64_t __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, ...
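
Putting both fixes together, a minimal sketch might look like the following. It assumes GCC or Clang on x86-64. The helper names gather_avx512 and gather_demo, the runtime CPU check, and the scalar fallback are not from the original post; they are added so the example also runs on machines without AVX-512. Only eight indices are declared because one __m512i holds eight 64-bit values.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <immintrin.h>
#endif

/* Source data, 64-byte aligned as in the question. */
static double __attribute__((aligned(64))) array3[17] = {
    1.0,  2.0,  3.0,  4.0,  5.0,  6.0,  7.0,  8.0,
    9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0,
    17.0};

/* Eight 64-bit indices: exactly one 512-bit register's worth. */
static int64_t __attribute__((aligned(64))) i_index_ar[8] = {
    1, 2, 3, 4, 5, 6, 7, 8};

#if defined(__x86_64__)
/* Compiled for AVX-512F regardless of the command-line flags. */
static __attribute__((target("avx512f"))) void gather_avx512(double out[8])
{
    /* The aligned load needs a 64-byte aligned address. */
    assert(((uintptr_t)i_index_ar % 64) == 0);
    __m512i i_index = _mm512_load_epi64(i_index_ar);
    /* scale is in bytes, so it must be sizeof(double) == 8 here. */
    __m512d v = _mm512_i64gather_pd(i_index, array3, sizeof(double));
    _mm512_storeu_pd(out, v);
}
#endif

/* Hypothetical wrapper: returns 1 if the AVX-512 path ran,
 * 0 if the scalar fallback was used. Either way out[] holds
 * array3[i_index_ar[lane]] for each of the 8 lanes. */
int gather_demo(double out[8])
{
#if defined(__x86_64__)
    if (__builtin_cpu_supports("avx512f")) {
        gather_avx512(out);
        return 1;
    }
#endif
    for (int lane = 0; lane < 8; ++lane)
        out[lane] = array3[i_index_ar[lane]];
    return 0;
}
```

Calling gather_demo(out) should fill out with 2.0 through 9.0, since indices 1 to 8 select the second through ninth elements of array3.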

Memory access error with _mm512_i64gather_pd()
