Question

Does anyone know how to change this function so that it handles 64-bit values?

{
  unsigned int prev;

  __asm__ __volatile__ (
          " lock; cmpxchgl %1,%2; "
          : "=a"(prev)
          : "q"(new_value), "m"(*(int *)ptr), "0"(old_value)
          : "memory");

  return prev;
}

Brett Hale kindly suggested using unsigned long prev; and cmpxchgq instead of cmpxchgl, which led to these errors:

include/cs.h: Assembler messages:
include/cs.h:26: Error: incorrect register `%esi' used with `q' suffix
include/cs.h:26: Error: incorrect register `%esi' used with `q' suffix
include/cs.h:26: Error: incorrect register `%esi' used with `q' suffix
include/cs.h:26: Error: incorrect register `%r13d' used with `q' suffix
error: command 'gcc' failed with exit status 1

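I think I found out why Brett's suggestion did not work for me: I also had to change the types of the function's input parameters from int to long. For completeness, I am adding the result here: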

#ifndef __cs__include
#define __cs__include

static inline unsigned int CS(volatile void *ptr,
                              unsigned long old_value, /* was int */
                              unsigned long new_value) /* was int too */
{
  unsigned long prev; /* result */
  volatile unsigned long *vptr = (volatile unsigned long *) ptr;

  __asm__ __volatile__ (

          " lock; cmpxchgq %2, %1; "
          : "=a" (prev), "+m" (*vptr)
          : "r" (new_value), "0" (old_value)
          : "memory");

  return prev;
}

#endif /* __cs__include */

The code compiles without errors (although with many warnings). Unfortunately, the program still does not work on 64-bit.

Answer 1

A version using the compiler built-ins (the __sync style) looks like this:

#include <stdint.h>
#include <stdio.h>

uint64_t cas(uint64_t* ptr, uint64_t old_value, uint64_t new_value)
{
    return __sync_val_compare_and_swap(ptr, old_value, new_value);
}

int main()
{
    uint64_t foo = 42;
    uint64_t old = cas(&foo, 42, 1);
    printf("foo=%llu old=%llu\n", (unsigned long long)foo, (unsigned long long)old);
    return 0;
}

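The nice thing about this is that it works on many architectures. On x86, it uses cmpxchg8b in 32-bit mode and cmpxchgq in 64-bit mode.

Your question is not very clear; perhaps you intended to keep the operation 32-bit even when compiling in 64-bit mode. In that case, use uint32_t instead of uint64_t.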
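
As a rough sketch of the 32-bit variant mentioned above (the name cas32 is made up for illustration and is not part of the original code), the same built-in can simply be applied to a uint32_t:

```c
#include <stdint.h>

/* cas32 is a hypothetical name; it mirrors cas() above, but 32 bits wide.
 * If *ptr equals old_value, new_value is stored atomically.
 * In every case the value *ptr held before the operation is returned. */
static uint32_t cas32(uint32_t *ptr, uint32_t old_value, uint32_t new_value)
{
    return __sync_val_compare_and_swap(ptr, old_value, new_value);
}
```

The caller can tell whether the swap happened by comparing the return value with old_value: if they are equal, the new value was stored.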

Answer 2

First, since we are probably dealing with the LP64 data model, use unsigned long prev; for the 64-bit quantity. Then replace cmpxchgl with cmpxchgq (the 64-bit instruction).

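Operand %1 has the "q" constraint, which on IA32 restricts the choice to %eax, %ebx, %ecx, or %edx. That restriction does not apply on x86-64. According to {{3}}, we should be able to leave it as "q", but it is better described with "r".
Operand %2 is a memory input and should also be promoted to unsigned long. It is probably a good idea to mark it volatile as well, which prevents the compiler from deciding to fetch or update the memory ahead of time at its own convenience.
Operand %3 just means that %rax is both an input and an output; it does not need to change.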

{
  unsigned long prev; /* result */
  volatile unsigned long *vptr = (volatile unsigned long *) ptr;

  __asm__ __volatile__ (

          " lock; cmpxchgq %1,%2; "
          : "=a"(prev)
          : "r"(new_value), "m"(*vptr), "0"(old_value)
          : "memory");

  return prev;
}

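However, using "m" in the input constraints is technically incorrect, since the cmpxchgq instruction can update that memory. The gcc documentation had a long-standing error, since corrected, stating that "+m" (memory that is both an input and an output) was not allowed. A more correct expression is: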

  __asm__ __volatile__ (

          " lock; cmpxchgq %2, %1; " /* notice reversed operand order! */
          : "=a" (prev), "+m" (*vptr)
          : "r" (new_value), "0" (old_value)
          : "memory");
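
Putting the pieces above together, a complete function might look like the following sketch. The name cas64 and the fallback branch for non-x86-64 targets are assumptions added for illustration; uint64_t is used instead of unsigned long so the width does not depend on the data model.

```c
#include <stdint.h>

/* cas64 is a hypothetical name. It returns the value *ptr held before the
 * operation; the swap happened if and only if that value equals old_value. */
static inline uint64_t cas64(volatile void *ptr,
                             uint64_t old_value,
                             uint64_t new_value)
{
#if defined(__x86_64__)
    uint64_t prev; /* result, delivered in %rax */
    volatile uint64_t *vptr = (volatile uint64_t *) ptr;

    __asm__ __volatile__ (
            "lock; cmpxchgq %2, %1"
            : "=a" (prev), "+m" (*vptr)         /* %0 = rax, %1 = memory */
            : "r" (new_value), "0" (old_value)  /* %2 = new, rax = old   */
            : "memory");

    return prev;
#else
    /* Assumption: on other targets, fall back to the built-in from Answer 1. */
    return __sync_val_compare_and_swap((volatile uint64_t *) ptr,
                                       old_value, new_value);
#endif
}
```

Note that the return type is 64 bits wide as well. Returning the result through a 32-bit type such as unsigned int would silently drop the upper half of the previous value.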

Converting inline assembly from 32-bit to 64-bit
