Question

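Why does the code below fail to print "Hello world!"? Is it related to the CPU cache? I thought the CPU is supposed to guarantee cache coherence, isn't it? After thread_fun2 modifies the value, shouldn't thread_fun see the new value instead of a stale cached one? I know that std::atomic solves this problem, but I don't understand why the code below doesn't work.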

#include <stdio.h>
#include <thread>
int a = 4;

void thread_fun() {
    while(a!=3) {

    }
    printf("Hello world!\n");
}
void thread_fun2() {
    a=3;
    printf("Set!\n");
}


int main()  {
    auto tid=std::thread(thread_fun);
    auto tid2=std::thread(thread_fun2);
    tid.join();
    tid2.join();
}

Build options:

g++ -o multi multi.cc -O3 -std=c++11 -lpthread

Here is the gdb output:

(gdb) disass thread_fun
Dump of assembler code for function _Z10thread_funv:
   0x0000000000400af0 <+0>:     cmpl   $0x3,0x201599(%rip)        # 0x602090 <a>
   0x0000000000400af7 <+7>:     je     0x400b00 <_Z10thread_funv+16>
   0x0000000000400af9 <+9>:     jmp    0x400af9 <_Z10thread_funv+9>
   0x0000000000400afb <+11>:    nopl   0x0(%rax,%rax,1)
   0x0000000000400b00 <+16>:    mov    $0x401090,%edi
   0x0000000000400b05 <+21>:    jmpq   0x4008f0 <puts@plt>
End of assembler dump.
(gdb) disass thread_fun2
Dump of assembler code for function _Z11thread_fun2v:
   0x0000000000400b10 <+0>:     mov    $0x40109d,%edi
   0x0000000000400b15 <+5>:     movl   $0x3,0x201571(%rip)        # 0x602090 <a>
   0x0000000000400b1f <+15>:    jmpq   0x4008f0 <puts@plt>
End of assembler dump.
(gdb)

Test output:

[root@centos-test tmp]# ./multi 
Set!
^C
[root@centos-test tmp]# ./multi 
Set!
^C
[root@centos-test tmp]# ./multi 
Set!
^C
[root@centos-test tmp]# ./multi 
Set!
^C
[root@centos-test tmp]# ./multi 
Set!
^C

Update: Thanks, everyone. I now see that this problem is actually caused by the compiler.

(gdb) disass thread_fun
Dump of assembler code for function _Z10thread_funv:
   0x0000000000400af0 <+0>:     cmpl   $0x3,0x201599(%rip)        # 0x602090 <a>
   0x0000000000400af7 <+7>:     je     0x400b00 <_Z10thread_funv+16>
   0x0000000000400af9 <+9>:     jmp    0x400af9 <_Z10thread_funv+9>  ###jump to itself
   0x0000000000400afb <+11>:    nopl   0x0(%rax,%rax,1)
   0x0000000000400b00 <+16>:    mov    $0x401090,%edi
   0x0000000000400b05 <+21>:    jmpq   0x4008f0 <puts@plt>
End of assembler dump.

It seems the compiler treats this as a single-threaded program.

Answer 1

The problem is that the standard allows the compiler to optimize your code as if it were free of data races (not a direct quote!).

So when it analyzes

while(a!=3) {

}

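it sees that it has to check a != 3 once, and that nothing happens before the next iteration of the loop. It therefore concludes there is no need to check a again, because a cannot have changed.

Changing the type of a to std::atomic<int> forces it to check the value of a on every iteration, and the loop should then work as expected.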
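
A minimal sketch of that change, assuming the rest of the program stays as in the question. The helper name run_atomic_demo is made up for this example, the yield() call is optional, and the loads and stores use the default (sequentially consistent) ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <thread>

// Same variable as in the question, but atomic: each a.load() is an atomic
// read, so the loop observes the store made by the other thread.
std::atomic<int> a{4};

void thread_fun() {
    while (a.load() != 3) {
        std::this_thread::yield();  // optional: give up the CPU while spinning
    }
    std::printf("Hello world!\n");
}

void thread_fun2() {
    a.store(3);
    std::printf("Set!\n");
}

// Hypothetical helper: runs both threads and returns the final value of a.
int run_atomic_demo() {
    std::thread t1(thread_fun);
    std::thread t2(thread_fun2);
    t1.join();
    t2.join();
    assert(a.load() == 3);  // thread_fun only returns after it has seen 3
    return a.load();
}
```

Calling run_atomic_demo() from main prints both lines (in either order) and then returns instead of hanging.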

Answer 2

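The formal explanation is that you are not allowed to read and write a non-atomic variable from multiple threads without synchronization. This is called a data race, and it triggers undefined behavior.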

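Because this is not allowed, the compiler is under no obligation to make accesses to a visible across threads: it need not commit the store to a anywhere other threads can see it, and it need not re-read a inside the loop. When compiling with -O3 you see the latter in your disassembly, where thread_fun compares a once and then jumps to itself.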

As you said, the solution is to change a to std::atomic<int> (a data-race-free type), and then you are all set.
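
To illustrate what "visible to other threads" buys you, here is a hedged sketch in which an atomic flag publishes a plain int. All names (payload, ready, producer, consumer, run_publish_demo) are invented for this example, and the explicit release/acquire orderings are my choice; the default ordering would work as well.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain int, written before the flag is set
std::atomic<bool> ready{false};  // the only object accessed concurrently

void producer() {
    payload = 42;                                  // ordinary store
    ready.store(true, std::memory_order_release);  // publish it
}

int consumer() {
    // Spin until the flag is observed; each iteration performs a real load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    // The acquire load that saw true synchronizes with the release store,
    // so reading payload here is not a data race.
    return payload;
}

// Hypothetical helper: runs both threads and returns what the consumer saw.
int run_publish_demo() {
    int seen = 0;
    std::thread t1([&seen] { seen = consumer(); });
    std::thread t2(producer);
    t1.join();
    t2.join();
    return seen;
}
```

The non-atomic payload is safe to read only because it is written before the release store and read after the acquire load; accessing it without the flag would be the same data race as in the question.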

Answer 3

What you want to do is a typical use case for std::condition_variable:

#include <stdio.h>
#include <thread>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
int a = 4;

void thread_fun() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{return a == 3;});
    printf("Hello world!\n");
}

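void thread_fun2() {
    {
        // Lock the same mutex the waiting thread uses.
        std::lock_guard<std::mutex> lk(m);
        a = 3;
    }
    // Wake the waiter; without a notification it could block forever.
    cv.notify_one();
    printf("Set!\n");
}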

int main()  {
    auto tid=std::thread(thread_fun);
    auto tid2=std::thread(thread_fun2);
    tid.join();
    tid2.join();
}

Note that lock_guard and unique_lock, both taken on the mutex m, are what synchronize thread 1 and thread 2.
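
For readers unfamiliar with the predicate overload of wait, the sketch below spells it out as the equivalent explicit loop. The function names waiter, setter and run_cv_demo are invented for this example; it is an illustration under those assumptions rather than a replacement for the code above.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
int a = 4;

void waiter() {
    std::unique_lock<std::mutex> lk(m);
    // Equivalent to: cv.wait(lk, [] { return a == 3; });
    // The condition is re-checked after every wakeup, so spurious wakeups
    // are harmless, and a setter that ran first is noticed without waiting.
    while (a != 3) {
        cv.wait(lk);  // releases m while blocked, re-acquires it before returning
    }
}

void setter() {
    {
        std::lock_guard<std::mutex> lk(m);  // modify a only while holding m
        a = 3;
    }
    cv.notify_one();  // wake the waiter if it is blocked
}

// Hypothetical helper: runs both threads and returns the final value of a.
int run_cv_demo() {
    std::thread t1(waiter);
    std::thread t2(setter);
    t1.join();
    t2.join();
    std::lock_guard<std::mutex> lk(m);
    return a;
}
```

Because a is only read and written while m is held, there is no data race, and the thread start order does not matter.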
