Question

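I'm running x86, and I'd like to actually see a bug caused by out-of-order execution on my machine. I tried writing a program based off this wiki article, but I always see "value of x is 33":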

#include<stdio.h>
#include<pthread.h>
#include <sys/types.h>

int x, f;

void *handler(void *ptr) { 
  while (f == 0); 
  // Expectation: Sometimes, this should print 11 due to out-of-order exec
  printf("value of x is %d \n", x);
  return NULL;
}

int main() {
     pthread_t thread1;
     while(1) {
       x = 11; f = 0;
       pthread_create(&thread1, NULL, handler, NULL);
       x = 33; 
       f = 1;
       pthread_join(thread1, NULL);
     }   
     return 0;
}

What is the simplest C program that can illustrate an out-of-order execution bug? Why doesn't this sometimes print "value of x is 11"?

Answer 1

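The effect you're trying to create doesn't depend on out-of-order execution. That's only one of the things that can create memory reordering. Also, modern x86 does execute out of order, but it uses its Memory Order Buffer to ensure that stores commit to L1d / become globally visible in program order. (x86's memory model only allows StoreLoad reordering, not StoreStore.)

Memory reordering is separate from instruction execution reordering, because even in-order CPUs use a store buffer to avoid stalling on cache-miss stores.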

Out-of-order instruction execution: is commit order preserved?

Are loads and stores the only instructions that gets reordered?
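
Since StoreLoad is the one reordering x86 does allow, that is the effect a litmus test can actually catch on your machine. Here is a minimal sketch, not taken from the linked questions: each of two threads stores 1 to its own flag and then loads the other's flag. If both loads return 0, each load took effect before the other thread's store became globally visible, i.e. a StoreLoad reordering. The names (`X`, `Y`, `r1`, `r2`, `count_storeload_reorderings`) are made up for illustration, POSIX barriers are assumed to be available (e.g. Linux with glibc), and error checks are omitted.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>

static atomic_int X, Y;   /* the two flags */
static int r1, r2;        /* what each thread loaded this round */
static long rounds;       /* set before the threads are created */
static pthread_barrier_t start_gate, end_gate;

static void *thread_a(void *arg) {
    (void)arg;
    for (long i = 0; i < rounds; i++) {
        pthread_barrier_wait(&start_gate);
        atomic_store_explicit(&X, 1, memory_order_relaxed);   /* store X ...    */
        r1 = atomic_load_explicit(&Y, memory_order_relaxed);  /* ... then load Y */
        pthread_barrier_wait(&end_gate);
    }
    return NULL;
}

static void *thread_b(void *arg) {
    (void)arg;
    for (long i = 0; i < rounds; i++) {
        pthread_barrier_wait(&start_gate);
        atomic_store_explicit(&Y, 1, memory_order_relaxed);   /* store Y ...    */
        r2 = atomic_load_explicit(&X, memory_order_relaxed);  /* ... then load X */
        pthread_barrier_wait(&end_gate);
    }
    return NULL;
}

/* Runs n rounds and returns how many ended with r1 == 0 && r2 == 0. */
long count_storeload_reorderings(long n) {
    pthread_t a, b;
    long seen = 0;

    rounds = n;
    pthread_barrier_init(&start_gate, NULL, 3);  /* two workers + this thread */
    pthread_barrier_init(&end_gate, NULL, 3);
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);

    for (long i = 0; i < n; i++) {
        atomic_store(&X, 0);                 /* reset before releasing the workers */
        atomic_store(&Y, 0);
        pthread_barrier_wait(&start_gate);   /* let both workers run one round */
        pthread_barrier_wait(&end_gate);     /* wait until both have finished  */
        if (r1 == 0 && r2 == 0)
            seen++;                          /* both loads ran ahead of the other store */
    }

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_barrier_destroy(&start_gate);
    pthread_barrier_destroy(&end_gate);
    return seen;
}
```

Calling `count_storeload_reorderings(1000000)` from `main` and printing the result (built with something like `gcc -O2 -pthread`) should usually report a nonzero count on a multi-core x86 machine, though how often it fires depends heavily on timing. Switching all four accesses to `memory_order_seq_cst` should make the count drop to zero, because the stores then have to drain before the following load.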

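A C implementation on an in-order ARM CPU could print either 11 or 33, if x and f ended up in different cache lines.

I assume you compiled with optimization disabled, so your compiler effectively treated all your variables as volatile, i.e. as if declared volatile int x,f. Otherwise the while(f==0); loop would compile to if(f==0) { infloop; }, checking f only once. (Data-race UB for non-atomic variables is what allows compilers to hoist loads out of loops, but volatile loads always have to be done. https://electronics.stackexchange.com/questions/387181/mcu-programming-c-o2-optimization-breaks-while-loop#387478)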
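
To make the hoisting concrete, here is a small sketch of the loop as written, the shape an optimizer is allowed to give it, and an atomic variant that keeps re-reading the flag. The names (`f_plain`, `f_atomic`, `spin_*`) are hypothetical, and what a particular compiler actually emits at a given optimization level is not guaranteed.

```c
#include <stdatomic.h>

int f_plain;          /* plain int, like f in the question */
atomic_int f_atomic;  /* hypothetical atomic replacement   */

/* The loop as written in the question. */
void spin_as_written(void) {
    while (f_plain == 0)
        ;
}

/* What an optimizing compiler may turn it into: f_plain is loaded once,
   and if it was 0 the loop never looks at it again. */
void spin_as_optimized(void) {
    if (f_plain == 0)
        for (;;)
            ;
}

/* With an atomic flag, compilers in practice reload it on every iteration,
   so a store from another thread is eventually noticed. */
void spin_atomic(void) {
    while (atomic_load_explicit(&f_atomic, memory_order_relaxed) == 0)
        ;
}
```

If the flag is already nonzero when these are called, all three return immediately. The difference only shows when another thread sets the flag after the spin has started.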

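The stores in the resulting asm / machine code will appear in C source order.

You're compiling for x86, which has a strong memory model: x86 stores are release stores, and x86 loads are acquire loads. You don't get sequential consistency, but you do get acq_rel for free. (And with un-optimized code, that happens even if you don't ask for it.)

So when compiled without optimization for x86, your program is equivalent to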

_Atomic int x, f;

int main(){
    ...
    pthread_create
    atomic_store_explicit(&x, 33, memory_order_release);
    atomic_store_explicit(&f, 1, memory_order_release);
    ...
}

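The same goes for the load side. The while(f==0){} is an acquire load on x86, so having the reader wait until it sees a non-zero f guarantees that it also sees x==33.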
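
For reference, a self-contained version of that release/acquire pairing might look like the sketch below. It is written with C11 atomics so the guarantee holds on any ISA, not just x86. The names `reader_acq` and `observe_x_once` are made up, error checks are omitted, and only the store to `f` is made a release store here, since that is enough to publish the earlier store to `x`.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, f;

static void *reader_acq(void *out) {
    /* Acquire load: once this sees f == 1, everything the writer did
       before its release store to f is visible to this thread. */
    while (atomic_load_explicit(&f, memory_order_acquire) == 0)
        ;
    *(int *)out = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Runs one round of the question's experiment and returns the x the reader saw. */
int observe_x_once(void) {
    pthread_t t;
    int seen = -1;

    atomic_store_explicit(&x, 11, memory_order_relaxed);
    atomic_store_explicit(&f, 0, memory_order_relaxed);
    pthread_create(&t, NULL, reader_acq, &seen);

    atomic_store_explicit(&x, 33, memory_order_relaxed);
    atomic_store_explicit(&f, 1, memory_order_release);  /* publishes x == 33 */

    pthread_join(t, NULL);
    return seen;
}
```

`observe_x_once()` should return 33 on every call, on any conforming implementation. On x86 the release store and acquire load compile to plain mov instructions, which is why the un-optimized original behaves the same way.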

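But if you compiled for a weakly-ordered ISA like ARM or PowerPC, the asm-level memory-ordering guarantees there do allow StoreStore and LoadLoad reordering, so it would be possible for your program to print 11 if compiled without optimization.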
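
In C11 terms, that weaker behaviour corresponds roughly to using memory_order_relaxed everywhere, as in the sketch below. The names `reader_rlx` and `observe_x_relaxed` are hypothetical and error checks are omitted. With relaxed ordering nothing ties the store to x to the store to f, so a weakly-ordered machine (or the compiler) may make them visible in either order.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, f;

static void *reader_rlx(void *out) {
    /* Relaxed load: seeing f == 1 says nothing about which x is visible. */
    while (atomic_load_explicit(&f, memory_order_relaxed) == 0)
        ;
    *(int *)out = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Returns the x the reader saw: 33 normally, but 11 is also allowed. */
int observe_x_relaxed(void) {
    pthread_t t;
    int seen = -1;

    atomic_store_explicit(&x, 11, memory_order_relaxed);
    atomic_store_explicit(&f, 0, memory_order_relaxed);
    pthread_create(&t, NULL, reader_rlx, &seen);

    /* No ordering between these two stores is requested. */
    atomic_store_explicit(&x, 33, memory_order_relaxed);
    atomic_store_explicit(&f, 1, memory_order_relaxed);

    pthread_join(t, NULL);
    return seen;
}
```

On x86 hardware this will most likely still return 33 every time, because the CPU keeps stores in order as long as the compiler emits them in source order. Running it on a multi-core ARM or PowerPC machine is where an 11 could show up, and even there it may be rare.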

See also https://preshing.com/20120930/weak-vs-strong-memory-models/
