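I am trying to use extended inline assembly to divide arrays of 32-bit floats stored in 4-float vector structs.
This is the compiler error:
sisd.c: In function 'divSISD':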
sisd.c:182:9: error: ‘asm’ operand has impossible constraints
asm(
^~~
The code:
    void divSISD(Vector4f* vecA, Vector4f* vecB, Vector4f* result, int size){
        for(int i = 0; i < size; i++){
            asm(
                "fld %4\n\t"
                "fdiv %5\n\t"
                "fstp %0\n\t"
                "fld %6\n\t"
                "fdiv %7\n\t"
                "fstp %1\n\t"
                "fld %8\n\t"
                "fdiv %9\n\t"
                "fstp %2\n\t"
                "fld %10\n\t"
                "fdiv %11\n\t"
                "fstp %3\n\t"
                : "=m" (result[i].a),
                  "=m" (result[i].b),
                  "=m" (result[i].c),
                  "=m" (result[i].d)
                : "m" (vecA[i].a), "m" (vecB[i].a),
                  "m" (vecA[i].b), "m" (vecB[i].b),
                  "m" (vecA[i].c), "m" (vecB[i].c),
                  "m" (vecA[i].d), "m" (vecB[i].d)
            );
        }
    }
It seems to work fine if I use non-pointer struct types, like this:
    void divSISD(Vector4f vecA, Vector4f vecB, Vector4f result, int size){
        asm(
            "fld %4\n\t"
            "fdiv %5\n\t"
            "fstp %0\n\t"
            "fld %6\n\t"
            "fdiv %7\n\t"
            "fstp %1\n\t"
            "fld %8\n\t"
            "fdiv %9\n\t"
            "fstp %2\n\t"
            "fld %10\n\t"
            "fdiv %11\n\t"
            "fstp %3\n\t"
            : "=m" (result.a),
              "=m" (result.b),
              "=m" (result.c),
              "=m" (result.d)
            : "m" (vecA.a), "m" (vecB.a),
              "m" (vecA.b), "m" (vecB.b),
              "m" (vecA.c), "m" (vecB.c),
              "m" (vecA.d), "m" (vecB.d)
        );
    }
I can't understand why this doesn't work, since the "m" constraint should be satisfiable in both cases.
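For anyone hitting the same error, here is a minimal sketch of one possible workaround. It rests on an assumption that the error message itself does not confirm: when building without optimization, the compiler may want a separate address register for each of the twelve "m" operands reached through a pointer, and run out of registers. The sketch keeps each asm statement down to three memory operands. The `Vector4f` layout and the `divOne` helper are hypothetical (the post does not show the struct definition), and the explicit `s` suffixes and the `"st"` clobber are additions that are not in the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the original post does not show the definition. */
typedef struct { float a, b, c, d; } Vector4f;

/* Hypothetical helper: one x87 division using only three memory operands,
   so the compiler never has to address twelve operands at once. */
static void divOne(const float *x, const float *y, float *out) {
    assert(x != NULL && y != NULL && out != NULL);
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "flds  %1\n\t"   /* push *x onto the x87 stack */
        "fdivs %2\n\t"   /* st(0) = st(0) / *y */
        "fstps %0\n\t"   /* store st(0) to *out and pop */
        : "=m" (*out)
        : "m" (*x), "m" (*y)
        : "st"           /* the asm temporarily uses the top of the x87 stack */
    );
#else
    *out = *x / *y;      /* plain C fallback on non-x86 targets */
#endif
}

/* Same signature idea as the pointer version in the question. */
void divSISD(const Vector4f *vecA, const Vector4f *vecB,
             Vector4f *result, int size) {
    for (int i = 0; i < size; i++) {
        divOne(&vecA[i].a, &vecB[i].a, &result[i].a);
        divOne(&vecA[i].b, &vecB[i].b, &result[i].b);
        divOne(&vecA[i].c, &vecB[i].c, &result[i].c);
        divOne(&vecA[i].d, &vecB[i].d, &result[i].d);
    }
}
```

Calling `divSISD(a, b, r, n)` on arrays of `n` structs fills `r[i]` with the element-wise quotients. If the register-pressure assumption is right, compiling the original pointer version with optimization enabled may also make the error go away, since the compiler could then address all fields from a few base registers, but that is worth verifying rather than relying on.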