Question

Please take a look at the following code:

#include <pthread.h>
#include <boost/atomic.hpp>

class ReferenceCounted {
  public:
    ReferenceCounted() : ref_count_(1) {}

    void reserve() {
      ref_count_.fetch_add(1, boost::memory_order_relaxed);
    }

    void release() {
      if (ref_count_.fetch_sub(1, boost::memory_order_release) == 1) {
        boost::atomic_thread_fence(boost::memory_order_acquire);
        delete this;
      }
    }

  private:
    boost::atomic<int> ref_count_;
};

void* Thread1(void* x) {
  static_cast<ReferenceCounted*>(x)->release();
  return NULL;
}

void* Thread2(void* x) {
  static_cast<ReferenceCounted*>(x)->release();
  return NULL;
}

int main() {
  ReferenceCounted* obj = new ReferenceCounted();
  obj->reserve(); // for Thread1
  obj->reserve(); // for Thread2
  obj->release(); // for the main()
  pthread_t t[2];
  pthread_create(&t[0], NULL, Thread1, obj);
  pthread_create(&t[1], NULL, Thread2, obj);
  pthread_join(t[0], NULL);
  pthread_join(t[1], NULL);
}

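This is somewhat similar to the Reference counting example from Boost.Atomic.

The main differences are that the embedded ref_count_ is initialized to 1 in the constructor (once the constructor completes, we hold a single reference to the ReferenceCounted object) and that the code does not use boost::intrusive_ptr.

Please don't blame me for using delete this in the code - this is the pattern used in a large code base at work, and there is nothing I can do about it right now.

Now, this code, compiled with clang 3.5 from trunk (details below) and ThreadSanitizer (tsan v2), results in the following output from ThreadSanitizer: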

WARNING: ThreadSanitizer: data race (pid=9871)
  Write of size 1 at 0x7d040000f7f0 by thread T2:
    #0 operator delete(void*) <null>:0 (a.out+0x00000004738b)
    #1 ReferenceCounted::release() /home/A.Romanek/tmp/tsan/main.cpp:15 (a.out+0x0000000a2c06)
    #2 Thread2(void*) /home/A.Romanek/tmp/tsan/main.cpp:29 (a.out+0x0000000a2833)

  Previous atomic write of size 4 at 0x7d040000f7f0 by thread T1:
    #0 __tsan_atomic32_fetch_sub <null>:0 (a.out+0x0000000896b6)
    #1 boost::atomics::detail::base_atomic<int, int, 4u, true>::fetch_sub(int, boost::memory_order) volatile /home/A.Romanek/tmp/boost/boost_1_55_0/boost/atomic/detail/gcc-atomic.hpp:499 (a.out+0x0000000a3329)
    #2 ReferenceCounted::release() /home/A.Romanek/tmp/tsan/main.cpp:13 (a.out+0x0000000a2a71)
    #3 Thread1(void*) /home/A.Romanek/tmp/tsan/main.cpp:24 (a.out+0x0000000a27d3)

  Location is heap block of size 4 at 0x7d040000f7f0 allocated by main thread:
    #0 operator new(unsigned long) <null>:0 (a.out+0x000000046e1d)
    #1 main /home/A.Romanek/tmp/tsan/main.cpp:34 (a.out+0x0000000a286f)

  Thread T2 (tid=9874, running) created by main thread at:
    #0 pthread_create <null>:0 (a.out+0x00000004a2d1)
    #1 main /home/A.Romanek/tmp/tsan/main.cpp:40 (a.out+0x0000000a294e)

  Thread T1 (tid=9873, finished) created by main thread at:
    #0 pthread_create <null>:0 (a.out+0x00000004a2d1)
    #1 main /home/A.Romanek/tmp/tsan/main.cpp:39 (a.out+0x0000000a2912)

SUMMARY: ThreadSanitizer: data race ??:0 operator delete(void*)
==================
ThreadSanitizer: reported 1 warnings

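The strange thing is that thread T2 performs a write of size 1 to the same memory location that thread T1 writes when it atomically decrements the reference counter.

How can the former write be explained? Is it some clean-up performed by the destructor of the ReferenceCounted class?

Is it a false positive? Or is the code wrong?

My setup is: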

$ uname -a
Linux aromanek-laptop 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ clang --version
Ubuntu clang version 3.5-1ubuntu1 (trunk) (based on LLVM 3.5)
Target: x86_64-pc-linux-gnu
Thread model: posix

The code is compiled as follows:

clang++ main.cpp -I/home/A.Romanek/tmp/boost/boost_1_55_0 -pthread -fsanitize=thread -O0 -g -ggdb3 -fPIE -pie -fPIC

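Note that on my machine the implementation of boost::atomic<T> resolves to the __atomic_load_n family of functions, which ThreadSanitizer claims to understand.

UPDATE 1: The same happens when using the clang 3.4 final release.

UPDATE 2: The same problem occurs with -std=c++11 and <atomic>, with both libstdc++ and libc++.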

Answer 1

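This looks like a false positive.

The thread_fence in the release() method enforces that all outstanding writes from fetch_sub calls happen-before the fence returns. Therefore, the delete on the next line cannot race with the previous writes from decreasing the refcount.

Quoting from the book C++ Concurrency in Action:

A release operation synchronizes-with a fence with an order of std::memory_order_acquire [...] if that release operation stores a value that's read by an atomic operation prior to the fence on the same thread as the fence.

Since decreasing the refcount is a read-modify-write operation, this should apply here.

To elaborate, the order of operations that we need to ensure is as follows: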

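1. Decreasing the refcount from a value > 1

2. Decreasing the refcount from 1 (the final decrement, where fetch_sub returns 1)

3. Deleting the object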

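2. and 3. are synchronized implicitly, as they happen on the same thread. 1. and 2. are synchronized since they are both atomic read-modify-write operations on the same value. If these two could race, the whole reference counting would be broken in the first place. So what is left is synchronizing 1. and 3.

This is exactly what the fence does. The write from 1. is a release operation that is, as we just discussed, synchronized with 2., a read of the same value. 3., which follows an acquire fence on the same thread as 2., now synchronizes with the write from 1., as guaranteed by the spec. This happens without requiring an additional acquire write on the object (as was suggested by @KerrekSB in the comments), which would also work but might be less efficient due to the additional write.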
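The three steps discussed above can be sketched with std::atomic (the question's second update notes that <atomic> behaves the same as boost::atomic here). This is only an illustrative sketch: the class name RefCountedFence, the g_fence_destroyed counter, and the run_fence_demo() driver are hypothetical names introduced to make the destruction observable, and they are not part of the original code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical counter, used only to observe how many objects were destroyed.
static std::atomic<int> g_fence_destroyed{0};

class RefCountedFence {
 public:
  RefCountedFence() : ref_count_(1) {}

  void reserve() { ref_count_.fetch_add(1, std::memory_order_relaxed); }

  void release() {
    // Steps 1 and 2: every decrement is a release read-modify-write,
    // so all decrements form one modification order on ref_count_.
    const int previous = ref_count_.fetch_sub(1, std::memory_order_release);
    assert(previous >= 1);
    if (previous == 1) {
      // Step 3: the acquire fence pairs with the release decrements made
      // by other threads before the object is destroyed.
      std::atomic_thread_fence(std::memory_order_acquire);
      delete this;
    }
  }

 private:
  ~RefCountedFence() {
    g_fence_destroyed.fetch_add(1, std::memory_order_relaxed);
  }

  std::atomic<int> ref_count_;
};

// Hypothetical driver mirroring main() from the question.
// Returns the number of objects destroyed during this call.
int run_fence_demo() {
  const int before = g_fence_destroyed.load();
  RefCountedFence* obj = new RefCountedFence();
  obj->reserve();  // for the first thread
  obj->reserve();  // for the second thread
  obj->release();  // the creator drops its own reference
  std::thread t1([obj] { obj->release(); });
  std::thread t2([obj] { obj->release(); });
  t1.join();
  t2.join();
  return g_fence_destroyed.load() - before;
}
```

Whichever thread performs the final decrement runs the fence and the delete, so exactly one destruction is expected per call to run_fence_demo().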

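Bottom line: Don't play around with memory orderings. Even experts get them wrong, and their impact on performance is often negligible. So unless a profiling run has proven that they kill your performance and you absolutely, positively have to optimize this, just pretend they don't exist and stick with the default memory_order_seq_cst.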
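As a rough illustration of that advice, the sketch below omits the memory-order arguments entirely, so std::atomic falls back to memory_order_seq_cst. A sequentially consistent read-modify-write acts as both an acquire and a release operation, so no separate fence is needed. The names SeqCstRefCounted, g_seq_cst_destroyed, and run_seq_cst_demo() are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical counter, used only to observe how many objects were destroyed.
static std::atomic<int> g_seq_cst_destroyed{0};

class SeqCstRefCounted {
 public:
  SeqCstRefCounted() : ref_count_(1) {}

  // No explicit ordering: defaults to std::memory_order_seq_cst.
  void reserve() { ref_count_.fetch_add(1); }

  void release() {
    // The seq_cst decrement is both a release (publishes this thread's
    // writes) and an acquire (sees the writes of earlier releasers).
    const int previous = ref_count_.fetch_sub(1);
    assert(previous >= 1);
    if (previous == 1) {
      delete this;
    }
  }

 private:
  ~SeqCstRefCounted() { g_seq_cst_destroyed.fetch_add(1); }

  std::atomic<int> ref_count_;
};

// Hypothetical driver; returns the number of objects destroyed in this call.
int run_seq_cst_demo() {
  const int before = g_seq_cst_destroyed.load();
  SeqCstRefCounted* obj = new SeqCstRefCounted();
  obj->reserve();
  obj->reserve();
  obj->release();
  std::thread t1([obj] { obj->release(); });
  std::thread t2([obj] { obj->release(); });
  t1.join();
  t2.join();
  return g_seq_cst_destroyed.load() - before;
}
```

Because this version contains no standalone fence, it also avoids the construct that the report in the question revolves around, at the possible cost of stronger ordering than strictly required.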

Answer 2

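Just highlighting @adam-romanek's comment for others who stumble upon this: at the time of writing (March 2018), ThreadSanitizer does not support standalone memory fences. This is alluded to in the ThreadSanitizer FAQ, which does not explicitly mention fences as being supported:

Q: What synchronization primitives are supported? TSan supports pthread synchronization primitives, built-in compiler atomic operations (sync/atomic), C++ <atomic> operations are supported with llvm libc++ (not very throughly [sic] tested, though).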
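One possible way to work around this limitation, sketched here under assumptions, is to replace the standalone fence with an acquire load of the counter when building with ThreadSanitizer. The load reads the value written by the thread's own final decrement, which belongs to the release sequence of the earlier release decrements, so it provides the same ordering through an atomic operation that TSan is expected to model. The DEMO_TSAN_ENABLED macro, the class name, the counter, and the driver are hypothetical names for this example, and whether a particular TSan version accepts the pattern without a report is an assumption that should be verified.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Detect ThreadSanitizer: GCC defines __SANITIZE_THREAD__, while clang
// exposes __has_feature(thread_sanitizer). DEMO_TSAN_ENABLED is a
// hypothetical macro name used only in this sketch.
#if defined(__SANITIZE_THREAD__)
#define DEMO_TSAN_ENABLED 1
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define DEMO_TSAN_ENABLED 1
#endif
#endif
#ifndef DEMO_TSAN_ENABLED
#define DEMO_TSAN_ENABLED 0
#endif

// Hypothetical counter, used only to observe how many objects were destroyed.
static std::atomic<int> g_workaround_destroyed{0};

class TsanFriendlyRefCounted {
 public:
  TsanFriendlyRefCounted() : ref_count_(1) {}

  void reserve() { ref_count_.fetch_add(1, std::memory_order_relaxed); }

  void release() {
    const int previous = ref_count_.fetch_sub(1, std::memory_order_release);
    assert(previous >= 1);
    if (previous == 1) {
#if DEMO_TSAN_ENABLED
      // Acquire load instead of a fence: an atomic operation on the
      // counter itself rather than a standalone fence.
      (void)ref_count_.load(std::memory_order_acquire);
#else
      std::atomic_thread_fence(std::memory_order_acquire);
#endif
      delete this;
    }
  }

 private:
  ~TsanFriendlyRefCounted() {
    g_workaround_destroyed.fetch_add(1, std::memory_order_relaxed);
  }

  std::atomic<int> ref_count_;
};

// Hypothetical driver; returns the number of objects destroyed in this call.
int run_workaround_demo() {
  const int before = g_workaround_destroyed.load();
  TsanFriendlyRefCounted* obj = new TsanFriendlyRefCounted();
  obj->reserve();
  obj->reserve();
  obj->release();
  std::thread t1([obj] { obj->release(); });
  std::thread t2([obj] { obj->release(); });
  t1.join();
  t2.join();
  return g_workaround_destroyed.load() - before;
}
```

Keeping the fence in regular builds preserves the original behavior; only sanitizer builds take the acquire-load path.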

ThreadSanitizer reports "data race on operator delete(void*)" when using embedded reference counter
