I want to write a C program that counts the number of bytes lying in the range a ... c, using the following code:
char a[16], b[16], c[16];
int counter = 0;
for (int i = 0; i < 16; i++)
{
    /* count b[i] only when it lies strictly between a[i] and c[i] */
    if ((a[i] < b[i]) && (b[i] < c[i]))
        counter++;
}
return counter;
I am planning to do something like this:
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
unsigned short out1 = _mm_movemask_epi8(result1);
unsigned short out2 = _mm_movemask_epi8(result2);
unsigned short out3 = out1 & out2;
unsigned short out4 = _mm_popcnt_u32(out3);
Is my approach correct? Is there a better way?
Answer 0 (score: 4)
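Your approach looks reasonable. I think you can save an instruction by performing the AND in the SIMD registers, so that only one _mm_movemask_epi8 is needed, like this: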
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
__m128i mask = _mm_and_si128(result1, result2);
int mask2 = _mm_movemask_epi8(mask);
int counter = _mm_popcnt_u32(mask2);
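
For completeness, here is a minimal sketch of how the snippet above might be wrapped into a function and checked against the scalar loop from the question. The function names count_in_range_scalar and count_in_range_simd are made up for illustration. The sketch makes a few assumptions that are not in the original code: the arrays are declared signed char (because _mm_cmpgt_epi8 and _mm_cmplt_epi8 compare signed bytes, whereas plain char may be unsigned on some platforms), the data is loaded with unaligned loads, and the GCC/Clang builtin __builtin_popcount is swapped in for _mm_popcnt_u32 so that no extra -mpopcnt flag is needed. When SSE2 is not available, it simply falls back to the scalar loop.

```c
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Reference version: the scalar loop from the question (hypothetical name). */
int count_in_range_scalar(const signed char *a, const signed char *b,
                          const signed char *c)
{
    int counter = 0;
    for (int i = 0; i < 16; i++) {
        if (a[i] < b[i] && b[i] < c[i])
            counter++;
    }
    return counter;
}

/* SIMD version (hypothetical name): counts i with a[i] < b[i] < c[i]. */
int count_in_range_simd(const signed char *a, const signed char *b,
                        const signed char *c)
{
#if defined(__SSE2__)
    /* Unaligned loads: assumes nothing about the alignment of the arrays. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i vc = _mm_loadu_si128((const __m128i *)c);

    __m128i gt_a = _mm_cmpgt_epi8(vb, va);      /* 0xFF where b > a (signed) */
    __m128i lt_c = _mm_cmplt_epi8(vb, vc);      /* 0xFF where b < c (signed) */
    __m128i mask = _mm_and_si128(gt_a, lt_c);   /* both conditions hold */

    /* One bit per byte lane, then count the set bits.
       __builtin_popcount is a GCC/Clang builtin used here in place of
       _mm_popcnt_u32 (an assumption, to avoid requiring -mpopcnt). */
    unsigned bits = (unsigned)_mm_movemask_epi8(mask);
    return __builtin_popcount(bits);
#else
    /* No SSE2: fall back to the scalar loop. */
    return count_in_range_scalar(a, b, c);
#endif
}
```

Calling both functions on the same three 16-byte arrays should give the same count. If the data really is unsigned, the signed compares above will not match an unsigned scalar loop, and the inputs would need to be adjusted first.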