Question

Recently, while studying how thread-local storage is implemented in glibc, I came across the following code, which implements the API pthread_key_create():

int
__pthread_key_create (key, destr)
      pthread_key_t *key;
      void (*destr) (void *);
{
    /* Find a slot in __pthread_keys which is unused.  */
    for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
    {
        uintptr_t seq = __pthread_keys[cnt].seq;

        if (KEY_UNUSED (seq) && KEY_USABLE (seq)
            /* We found an unused slot.  Try to allocate it.  */
            && ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[cnt].seq,
                                                       seq + 1, seq))
        {
            /* Remember the destructor.  */
            __pthread_keys[cnt].destr = destr;

            /* Return the key to the caller.  */
            *key = cnt;

            /* The call succeeded.  */
            return 0;
        }
    }

    return EAGAIN;
}

__pthread_keys is a global array accessible to all threads. I don't understand why the read of its member seq is not synchronized, as shown here:

uintptr_t seq = __pthread_keys[cnt].seq;

even though the later modification of it is synchronized.

FYI, __pthread_keys is an array of type struct pthread_key_struct, which is defined as follows:

/* Thread-local data handling.  */
struct pthread_key_struct
{
    /* Sequence numbers.  Even numbers indicate vacant entries.  Note
       that zero is even.  We use uintptr_t to not require padding on
       32- and 64-bit machines.  On 64-bit machines it helps to avoid
       wrapping, too.  */
    uintptr_t seq;

    /* Destructor for the data.  */
    void (*destr) (void *);
};

Thanks in advance.

Answer 1

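In this case, the loop avoids acquiring an expensive lock. The atomic compare-and-swap (CAS) operation performed afterwards (atomic_compare_and_exchange_bool_acq) ensures that only one thread can successfully increment the sequence value and return the key to the caller. Other threads that read the same value in the first step will keep looping, because the CAS can succeed for only one thread. A stale value from the plain read is harmless: the CAS succeeds only if seq still holds the value that was read, so a stale read merely makes the CAS fail.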
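The race described above can be replayed deterministically in a single thread. This is only a sketch using C11 <stdatomic.h> rather than glibc's internal macro; seq_cell, try_claim and count_winners are hypothetical names invented for the illustration. Note that the C11 call returns true on success, whereas glibc's atomic_compare_and_exchange_bool_acq returns zero on success (hence the `!` in the code quoted in the question).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single slot; starts at 0, i.e. vacant. */
static _Atomic uintptr_t seq_cell;

/* Try to move seq_cell from `seen` to `seen + 1`.
   Returns true only if seq_cell still held `seen`. */
static bool try_claim(uintptr_t seen)
{
    return atomic_compare_exchange_strong_explicit(
        &seq_cell, &seen, seen + 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Replays the race: both "threads" read the value
   before either of them attempts the CAS. */
static int count_winners(void)
{
    uintptr_t seen_by_a = atomic_load_explicit(&seq_cell, memory_order_relaxed);
    uintptr_t seen_by_b = atomic_load_explicit(&seq_cell, memory_order_relaxed);

    int winners = 0;
    winners += try_claim(seen_by_a);  /* value unchanged since the read */
    winners += try_claim(seen_by_b);  /* value has moved on, CAS fails */
    return winners;
}
```

Both reads observe the same value, yet only the first CAS goes through; the second caller's copy is stale by the time it tries, so it simply fails and would move on to the next slot.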

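This works because the sequence value alternates between even (vacant) and odd (occupied). Incrementing the value to an odd number prevents other threads from claiming the slot.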
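A minimal sketch of the even/odd scheme, again with C11 atomics instead of glibc's internals. SLOT_COUNT, slot_seq, claim_slot and release_slot are hypothetical names standing in for PTHREAD_KEYS_MAX, __pthread_keys and the key create/delete paths; the KEY_USABLE overflow check from the original is left out for brevity.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 4  /* hypothetical; stands in for PTHREAD_KEYS_MAX */

/* Zero-initialised, so every slot starts out vacant (zero is even). */
static _Atomic uintptr_t slot_seq[SLOT_COUNT];

/* Even sequence number: vacant.  Odd: occupied. */
static int slot_is_vacant(uintptr_t seq)
{
    return (seq & 1) == 0;
}

/* Returns the index of the claimed slot, or -1 if none is free. */
static int claim_slot(void)
{
    for (size_t i = 0; i < SLOT_COUNT; ++i) {
        /* Plain, cheap read; the CAS below re-validates it. */
        uintptr_t seq = atomic_load_explicit(&slot_seq[i], memory_order_relaxed);

        if (slot_is_vacant(seq)
            && atomic_compare_exchange_strong_explicit(
                   &slot_seq[i], &seq, seq + 1,
                   memory_order_acquire, memory_order_relaxed))
            return (int) i;  /* seq is now odd: slot is ours */
    }
    return -1;
}

/* Makes the sequence number even again.  The caller must own slot i. */
static void release_slot(int i)
{
    atomic_fetch_add_explicit(&slot_seq[i], 1, memory_order_release);
}
```

Because the counter only ever grows, a slot that has been released and claimed again carries a different sequence number than before, so a CAS based on an old reading of that slot cannot succeed by accident.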

Reading the value typically takes fewer cycles than a CAS instruction, so it makes sense to check the value before attempting the CAS.

Many wait-free and lock-free algorithms use CAS instructions to achieve low-overhead synchronization.
