How do I use an if condition in Neon intrinsics for parallel operations?

Time: 2013-12-09 03:27:15

Tags: arm simd neon intrinsics

I previously asked a question about the vclt_s8 comparison: Does anybody know how to use Neon intrinsics uint8x8_t vclt_s8 (int8x8_t, int8x8_t)

But what if we have code like this:

if(a > b + c) {
    a = b + c;
} else if(a < b - c) {
    a = b - c;
}

How can this be converted to Neon intrinsics? In this case it seems we cannot perform 8 operations in parallel. Is that right?

1 Answer:

Answer 0: (score: 5)

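Obviously you can't branch per element with SIMD, so you have to look at how to implement this kind of logic in a branchless way using masks. I'll just give pseudocode so you get the general idea - coding this up should be fairly straightforward: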

bc = b + c       ; get `(b + c)` in a vector register
mask = a > bc    ; use compare instruction to generate mask (-1 = true, 0 = false)
bc = bc & mask   ; use bitwise AND to zero out elements of `(b + c)` which we do not want
a = a & ~mask    ; use bitwise ANDC to zero out elements of `a` which we do not want
a = a | bc       ; combine required elements into `a` using bitwise OR

bc = b - c       ; get `(b - c)` in a vector register
mask = a < bc    ; use compare instruction to generate mask (-1 = true, 0 = false)
bc = bc & mask   ; use bitwise AND to zero out elements of `(b - c)` which we do not want
a = a & ~mask    ; use bitwise ANDC to zero out elements of `a` which we do not want
a = a | bc       ; combine required elements into `a` using bitwise OR
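The pseudocode above could be written with intrinsics roughly as follows. This is only a sketch under some assumptions: the function name `clamp8_two_ifs` is hypothetical, the lanes are signed 8-bit (chosen because the question mentions `vclt_s8`), and sums and differences are allowed to wrap on overflow. Instead of the separate AND / ANDC / OR steps it uses `vbsl_s8` (bitwise select), which computes `(bc & mask) | (a & ~mask)` in one operation. A scalar fallback that spells out the same mask logic is included so the code also builds on targets without NEON.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original answer). Applies
 *   if (a > b + c) a = b + c;
 *   if (a < b - c) a = b - c;
 * to 8 signed 8-bit lanes. Sums and differences are assumed to wrap. */
void clamp8_two_ifs(int8_t a[8], const int8_t b[8], const int8_t c[8])
{
#if defined(__ARM_NEON)
    int8x8_t va = vld1_s8(a);
    int8x8_t vb = vld1_s8(b);
    int8x8_t vc = vld1_s8(c);

    int8x8_t bc = vadd_s8(vb, vc);     /* bc = b + c */
    uint8x8_t mask = vcgt_s8(va, bc);  /* 0xFF where a > bc, else 0x00 */
    va = vbsl_s8(mask, bc, va);        /* (bc & mask) | (a & ~mask) */

    bc = vsub_s8(vb, vc);              /* bc = b - c */
    mask = vclt_s8(va, bc);            /* 0xFF where a < bc, else 0x00 */
    va = vbsl_s8(mask, bc, va);

    vst1_s8(a, va);
#else
    /* Scalar fallback: same mask logic, one lane at a time. */
    for (int i = 0; i < 8; ++i) {
        int bc = (int8_t)(b[i] + c[i]);          /* wraps on common compilers */
        int mask = (a[i] > bc) ? -1 : 0;         /* -1 = true, 0 = false */
        int r = (bc & mask) | (a[i] & ~mask);

        bc = (int8_t)(b[i] - c[i]);
        mask = (r < bc) ? -1 : 0;
        a[i] = (int8_t)((bc & mask) | (r & ~mask));
    }
#endif
}
```

For 16 lanes per call, the `q` variants (`vld1q_s8`, `vaddq_s8`, `vcgtq_s8`, `vbslq_s8`, and so on) follow the same pattern.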

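Note that I cheated a little in the pseudocode and omitted the else from the scalar code (assuming the two branches are mutually exclusive), so what I've implemented is actually equivalent to: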

if (a > b + c) {
    a = b + c;
}
if (a < b - c) {
    a = b - c;
}

If that assumption does not hold, you will need some additional bitwise operations to implement the else logic.
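The two forms agree whenever c is non-negative and nothing overflows, since b + c is then never below b - c. They can differ when c is negative. One possible way to get true if / else-if behaviour is sketched below; the function name `clamp8_else_if` is hypothetical and 8-bit signed lanes with wrapping arithmetic are again an assumption. Both masks are computed from the original `a`, and the second mask is cleared wherever the first one fired, so each lane takes at most one branch.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original answer). Applies
 *   if (a > b + c) a = b + c; else if (a < b - c) a = b - c;
 * to 8 signed 8-bit lanes. Sums and differences are assumed to wrap. */
void clamp8_else_if(int8_t a[8], const int8_t b[8], const int8_t c[8])
{
#if defined(__ARM_NEON)
    int8x8_t va = vld1_s8(a);
    int8x8_t vb = vld1_s8(b);
    int8x8_t vc = vld1_s8(c);

    int8x8_t hi = vadd_s8(vb, vc);            /* b + c */
    int8x8_t lo = vsub_s8(vb, vc);            /* b - c */

    uint8x8_t m1 = vcgt_s8(va, hi);           /* first branch taken */
    uint8x8_t m2 = vclt_s8(va, lo);           /* second condition on original a */
    m2 = vbic_u8(m2, m1);                     /* m2 & ~m1: the "else" part */

    int8x8_t r = vbsl_s8(m2, lo, va);         /* lo where m2, otherwise a */
    r = vbsl_s8(m1, hi, r);                   /* hi where m1, otherwise r */

    vst1_s8(a, r);
#else
    /* Scalar fallback: same mask logic, one lane at a time. */
    for (int i = 0; i < 8; ++i) {
        int hi = (int8_t)(b[i] + c[i]);       /* wraps on common compilers */
        int lo = (int8_t)(b[i] - c[i]);
        int m1 = (a[i] > hi) ? -1 : 0;
        int m2 = ((a[i] < lo) ? -1 : 0) & ~m1;
        a[i] = (int8_t)((hi & m1) | (lo & m2) | (a[i] & ~(m1 | m2)));
    }
#endif
}
```

With b = 0, c = -2 and a = 5, this version yields -2 (only the first branch runs), whereas the two-if form would go on to overwrite the result with 2.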