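Extended assembly, floating-point division

Asked: 2018-04-18 22:39:27

Tags: c assembly x86 sse fpu

I'm trying to use extended inline assembly to divide arrays of 32-bit floats that are stored in structs of four floats (Vector4f).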

This is the compiler error:

sisd.c: In function 'divSISD':

sisd.c:182:9: error: ‘asm’ operand has impossible constraints
     asm(
     ^~~

The code:

void divSISD(Vector4f* vecA, Vector4f* vecB, Vector4f* result, int size){
    for(int i = 0; i < size; i++){
        asm(
            "fld %4\n\t"
            "fdiv %5\n\t"
            "fstp %0\n\t"

            "fld %6\n\t"
            "fdiv %7\n\t"
            "fstp %1\n\t"

            "fld %8\n\t"
            "fdiv %9\n\t"
            "fstp %2\n\t"

            "fld %10\n\t"
            "fdiv %11\n\t"
            "fstp %3\n\t"

            : "=m" (result[i].a),
              "=m" (result[i].b),
              "=m" (result[i].c),
              "=m" (result[i].d)
            : "m" (vecA[i].a), "m" (vecB[i].a),
              "m" (vecA[i].b), "m" (vecB[i].b),
              "m" (vecA[i].c), "m" (vecB[i].c),
              "m" (vecA[i].d), "m" (vecB[i].d)
        );
    }
}

If I pass the structs by value instead of through pointers, it seems to work fine, as shown here:

void divSISD(Vector4f vecA, Vector4f vecB, Vector4f result, int size){
    asm(
        "fld %4\n\t"
        "fdiv %5\n\t"
        "fstp %0\n\t"

        "fld %6\n\t"
        "fdiv %7\n\t"
        "fstp %1\n\t"

        "fld %8\n\t"
        "fdiv %9\n\t"
        "fstp %2\n\t"

        "fld %10\n\t"
        "fdiv %11\n\t"
        "fstp %3\n\t"

        : "=m" (result.a),
          "=m" (result.b),
          "=m" (result.c),
          "=m" (result.d)
        : "m" (vecA.a), "m" (vecB.a),
          "m" (vecA.b), "m" (vecB.b),
          "m" (vecA.c), "m" (vecB.c),
          "m" (vecA.d), "m" (vecB.d)
    );
}

I can't understand why the pointer version doesn't compile, since the "m" constraint should be valid in both cases.
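
One possible explanation, offered as an assumption rather than something confirmed in this thread: every "m" operand reached through a pointer needs registers to form its address, and twelve such operands in a single statement can exceed what the register allocator has available, particularly on 32-bit targets or in unoptimized builds. The by-value version addresses everything relative to the stack or frame pointer, so it needs no extra registers (although its writes go to a local copy of result and never reach the caller). A minimal sketch of a workaround is to split the work into one small asm statement per division, so that each statement has only three memory operands. The Vector4f layout and the helper name div_x87 are hypothetical, and the plain C fallback is there only so that the sketch also builds on non-x86 targets.

```c
/* Assumed layout: the question does not show the definition of Vector4f. */
typedef struct { float a, b, c, d; } Vector4f;

/* Hypothetical helper: one x87 division with only three memory operands. */
static inline float div_x87(float num, float den)
{
    float quot;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "flds  %1\n\t"   /* push num onto the x87 stack (32-bit load) */
        "fdivs %2\n\t"   /* st(0) = st(0) / den                        */
        "fstps %0"       /* store st(0) as a float and pop             */
        : "=m" (quot)
        : "m" (num), "m" (den));
#else
    quot = num / den;    /* portable fallback for non-x86 targets */
#endif
    return quot;
}

void divSISD(const Vector4f *vecA, const Vector4f *vecB,
             Vector4f *result, int size)
{
    for (int i = 0; i < size; i++) {
        result[i].a = div_x87(vecA[i].a, vecB[i].a);
        result[i].b = div_x87(vecA[i].b, vecB[i].b);
        result[i].c = div_x87(vecA[i].c, vecB[i].c);
        result[i].d = div_x87(vecA[i].d, vecB[i].d);
    }
}
```

The explicit s suffixes (flds, fdivs, fstps) state the 32-bit operand size instead of leaving it to the assembler's default. Each statement pushes one value and pops it again, so the x87 stack is left balanced, as in the original code.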

0 answers:

No answers yet.