Question

For float, floor() and int() seem easy enough, e.g.:

float z = floor(LOG2EF * x + 0.5f);
const int32_t n = int32_t(z);

becomes:

__m128 z = _mm_add_ps(_mm_mul_ps(log2ef, x), half);
__m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(z));
z = _mm_sub_ps(t, _mm_and_ps(_mm_cmplt_ps(z, t), one));

__m128i n = _mm_cvtps_epi32(z);

But how would you do the same for double, using only SSE2?

This is the double version I want to convert:

double z = floor(LOG2E * x + 0.5);
const int32_t n = int32_t(z);

Answer 1

Just use the double-precision (...pd...) equivalents of the single-precision (...ps...) intrinsics:

__m128i n = _mm_cvtpd_epi32(z);

According to the Intel Intrinsics Guide, this intrinsic is indeed available with SSE2: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=4966,1917&techs=SSE2

__m128i _mm_cvtpd_epi32 (__m128d a)

Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
FOR j := 0 to 1
  i := 32*j
  k := 64*j
  dst[i+31:i] := Convert_FP64_To_Int32(a[k+63:k])
ENDFOR
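
For completeness, here is a minimal sketch of how the whole float sequence from the question might be carried over to double with SSE2 only. The helper names floor_pd_sse2 and round_log2e_sse2 are made up for illustration, and the LOG2E constant is assumed to be log2(e). It also assumes every value fits in an int32_t, since the floor goes through a 32-bit truncation. Note that _mm_cvtpd_epi32 rounds according to the current rounding mode rather than truncating, but that does not matter here because z is already integral after the floor step.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: floor() for two packed doubles using only SSE2.
// Assumes every lane fits in the int32_t range.
static inline __m128d floor_pd_sse2(__m128d z) {
    const __m128d one = _mm_set1_pd(1.0);
    // Truncate toward zero, then convert back to double.
    __m128d t = _mm_cvtepi32_pd(_mm_cvttpd_epi32(z));
    // Truncation rounds negative non-integers up, so subtract 1 where z < t.
    return _mm_sub_pd(t, _mm_and_pd(_mm_cmplt_pd(z, t), one));
}

// Hypothetical helper: n = int32_t(floor(LOG2E * x + 0.5)) for two doubles.
// The two results land in the low two 32-bit lanes; the upper lanes are zero.
static inline __m128i round_log2e_sse2(__m128d x) {
    const __m128d log2e = _mm_set1_pd(1.4426950408889634);  // assumed value of LOG2E
    const __m128d half  = _mm_set1_pd(0.5);
    __m128d z = _mm_add_pd(_mm_mul_pd(log2e, x), half);
    z = floor_pd_sse2(z);
    return _mm_cvtpd_epi32(z);
}
```

Since a __m128d holds only two doubles, each call produces two integers; to fill a whole __m128i with four results you would have to run this twice and merge the low halves, for example with _mm_unpacklo_epi64.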
