Question

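Short version of the question: I have 2 functions that share the same array; while one edits it, the other reads it. However, the vector is long (5000 samples) and concurrent access rarely happens. Still, contention on the mutex MUTEX1 is slowing the program down.

How can I lock certain locations in memory instead of the whole block, in order to reduce contention?

EDIT: Note: I have to use up-to-date G values whenever possible.

EDIT2: For example, I have an array G of length 5000. foo1 locks mutex1 to edit index 124. Although foo2 wants to edit index 2349, it cannot do so until foo1 releases mutex1.

Is there a way to move the mutex contention down to the element level? Meaning: I want foo1 and foo2 to compete for the same mutex only when they want to edit the same index. For example: foo1 wants to edit index 3156 and foo2 wants to edit index 3156.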

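Long version with code explanation: I am writing code for a complex mathematical function, and I am using pthreads to parallelize it and improve performance. The real code is too complex to post, but I can post a model of it.

Basically I have 2 arrays that I want to edit using 2 threads running in parallel. One thread runs foo1 and the other runs foo2. However, they must run in a specific sequence, and I use mutexes (Mutex_B, Mutex_A1 and Mutex_A2) to control that sequence. It goes as follows:

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2 (second half)

Then I retrieve my results. In the first half of foo1 I use values in G that may be edited at the same time by foo2, so I use mutex1 to protect them. The same happens in foo2. However, locking the complete vector for a single value is very inefficient: the two threads almost never edit the same memory location at the same time. When I compare the results, they are almost always the same. I would like a way to lock one element at a time, so that the threads only contend for the same element.

I will describe the code for anyone interested in how it works: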

#include <pthread.h>
#include <iostream>
using namespace std;

#define numThreads 2
#define Length 10000

pthread_t threads[numThreads];
pthread_mutex_t mutex1   = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_B  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A2 = PTHREAD_MUTEX_INITIALIZER;

struct data_pointers
{
    double *A;
    double *B;
    double *G;
    double *L;
    int idxThread;
};

void foo1 (data_pointers &data);
void foo2 (data_pointers &data);

void *thread_func(void *arg)
{
    data_pointers data = *((data_pointers *) arg);
    if (data.idxThread == 0)
        foo1 (data);
    else
        foo2 (data);
    return NULL; // a void* thread function must return a value
}

So far these are the definitions and the thread entry function. Bear in mind that I define Length as 10000 and numThreads as 2.

void foo1 (data_pointers &data)
{
    double *A = data.A;
    double *L = data.L;
    double *G = data.G;
    double U;

    for (int ijk = 0; ijk < 5; ijk++) {
        /* here goes some definitions */

        pthread_mutex_lock(&Mutex_A1);

        for (int k = 0; k < Length; k++) {
            pthread_mutex_lock(&mutex1);
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /* U undergoes a lot of mathematical operations here */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);

        for (int k = 0; k < Length; k++) {
            /* U another mathematical operations here */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);
        }
    }
}

In foo1 I lock Mutex_A1 and do my work, then I lock Mutex_B and unlock Mutex_A2 so that foo2 can start working. Note that main locks Mutex_A2 first. This way I guarantee that foo1 starts its second half with Mutex_B locked, so foo2 cannot enter the second half of its function until foo1 unlocks Mutex_B.

void foo2 (data_pointers &data)
{
    double *A = data.A;
    double *L = data.L;
    double *G = data.G;
    double U;

    for (int ijk = 0; ijk < 5; ijk++) {
        /* here goes some definitions */

        pthread_mutex_lock(&Mutex_A1);

        for (int k = 0; k < Length; k++) {
            pthread_mutex_lock(&mutex1);
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /* U undergoes a lot of mathematical operations here */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);

        for (int k = 0; k < Length; k++) {
            /* U another mathematical operations here */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);
        }
    }
}

Now, when foo1 unlocks Mutex_B, it has to wait for foo2 to unlock Mutex_A1 before it can continue; foo2 only unlocks Mutex_A2 once it has already unlocked Mutex_B.

This goes on for 5 iterations.

Note that this is only example code. It compiles and works as expected, but produces no output.

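LAST EDIT: Thank you all for the great ideas. I learned a lot and enjoyed following these suggestions. I will upvote all the answers because they were useful, and accept the one closest to the original question (atomicity).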

Answer 1

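If the array is never resized, you do not need any mutex, either on individual elements or on the whole array.

Read your values atomically, write your values atomically, and keep calm.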
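As a rough illustration of this advice, here is a minimal sketch. It assumes the array can be stored as std::atomic<double> elements and that per-element consistency is all that is required. The function name run_atomic_demo and the values written are invented for the example and do not come from the question.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical demo: one thread writes k + 1 into G[k] while another reads
// every G[k]. Returns how many reads saw either the old value (0.0) or the
// new value (k + 1), i.e. a value that was actually stored at some point.
std::size_t run_atomic_demo(std::size_t length)
{
    std::vector<std::atomic<double>> G(length);
    for (auto& g : G) g.store(0.0);  // explicit initialisation

    std::size_t consistent = 0;  // touched only by the reader thread

    std::thread writer([&] {
        for (std::size_t k = 0; k < length; ++k)
            G[k].store(static_cast<double>(k) + 1.0);
    });
    std::thread reader([&] {
        for (std::size_t k = 0; k < length; ++k) {
            const double u = G[k].load();  // never a torn value
            if (u == 0.0 || u == static_cast<double>(k) + 1.0)
                ++consistent;
        }
    });

    writer.join();
    reader.join();
    return consistent;
}
```

Calling run_atomic_demo(5000) exercises an array of the size mentioned in the question. Whether the loads and stores are implemented without an internal lock depends on the platform; G[k].is_lock_free() reports this.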

Answer 2

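If you want high-performance multithreaded access to an array-like data structure without using mutexes, you can look into compare-and-swap. Maybe you can design a lock-free data structure that works for your specific problem. https://en.wikipedia.org/wiki/Compare-and-swap

Regarding the posted code, it seems you are making things a bit too complicated. If you want to achieve: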
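To make the compare-and-swap idea concrete, the following sketch applies a read-modify-write update to a single std::atomic<double> element with a retry loop. It is an illustration only: atomic_add and run_cas_demo are hypothetical names, and the "add a delta" update stands in for whatever calculation the real code performs.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: atomically performs target = target + delta.
void atomic_add(std::atomic<double>& target, double delta)
{
    double expected = target.load();
    // compare_exchange_weak stores expected + delta only if target still
    // holds `expected`; on failure it reloads `expected` and we retry.
    while (!target.compare_exchange_weak(expected, expected + delta)) {
    }
}

// Hypothetical demo: several threads increment one shared cell without a mutex.
double run_cas_demo(int thread_count, int increments)
{
    std::atomic<double> cell{0.0};
    std::vector<std::thread> pool;
    for (int t = 0; t < thread_count; ++t) {
        pool.emplace_back([&cell, increments] {
            for (int i = 0; i < increments; ++i)
                atomic_add(cell, 1.0);
        });
    }
    for (auto& th : pool) th.join();
    return cell.load();
}
```

With two threads and 10000 increments each, no update is lost, so the result is 20000. The same loop can be applied per element of an array of atomics, so that threads only retry when they actually collide on the same index.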

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

two mutexes should do.

Maybe something like this would work. Some pseudocode below:

// These global variables controls which thread is allowed to
// execute first and second half.
// 1 --> Foo1 may run
// 2 --> Foo2 may run
int accessFirstHalf = 1;
int accessSecondHalf = 1;

void foo1 ( data_pointers &data)
{
    while(YOU_LIKE_TO_GO_ON)
    {
        while (true)
        {
            TAKE_MUTEX_FIRST_HALF;
            if (accessFirstHalf == 1)
            {
                RELEASE_MUTEX_FIRST_HALF;
                break;
            }
            RELEASE_MUTEX_FIRST_HALF;
            pthread_yield();
        }

        // Do the first half

        TAKE_MUTEX_FIRST_HALF;
        // Allow Foo2 to do first half
        accessFirstHalf = 2;
        RELEASE_MUTEX_FIRST_HALF;

        while (true)
        {
            TAKE_MUTEX_SECOND_HALF;
            if (accessSecondHalf == 1)
            {
                RELEASE_MUTEX_SECOND_HALF;
                break;
            }
            RELEASE_MUTEX_SECOND_HALF;
            pthread_yield();
        }

        // Do the second half

        TAKE_MUTEX_SECOND_HALF;
        // Allow Foo2 to do second half
        accessSecondHalf = 2;
        RELEASE_MUTEX_SECOND_HALF;
    }
}


void foo2 ( data_pointers &data)
{
    while(YOU_LIKE_TO_GO_ON)
    {
        while (true)
        {
            TAKE_MUTEX_FIRST_HALF;
            if (accessFirstHalf == 2)
            {
                RELEASE_MUTEX_FIRST_HALF;
                break;
            }
            RELEASE_MUTEX_FIRST_HALF;
            pthread_yield();
        }

        // Do the first half

        TAKE_MUTEX_FIRST_HALF;
        // Allow Foo1 to do first half
        accessFirstHalf = 1;
        RELEASE_MUTEX_FIRST_HALF;

        while (true)
        {
            TAKE_MUTEX_SECOND_HALF;
            if (accessSecondHalf == 2)
            {
                RELEASE_MUTEX_SECOND_HALF;
                break;
            }
            RELEASE_MUTEX_SECOND_HALF;
            pthread_yield();
        }

        // Do the second half

        TAKE_MUTEX_SECOND_HALF;
        // Allow Foo1 to do second half
        accessSecondHalf = 1;
        RELEASE_MUTEX_SECOND_HALF;
    }
}


int main()
{
    // start the threads with foo1 and foo2
}
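The turn-taking idea in the pseudocode can also be written without the yield loops. The following is a sketch only: it swaps the polling for std::condition_variable, which is a different mechanism from the one in the pseudocode. The Turn struct, run_schedule and the log strings are hypothetical names introduced for the example.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// One token per half: owner is 1 (foo1 may run) or 2 (foo2 may run).
struct Turn {
    std::mutex m;
    std::condition_variable cv;
    int owner = 1;

    void wait_for(int who) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return owner == who; });
    }
    void hand_to(int who) {
        {
            std::lock_guard<std::mutex> lock(m);
            owner = who;
        }
        cv.notify_all();
    }
};

// Runs both workers for `rounds` iterations and returns the order in which
// the halves were executed.
std::vector<std::string> run_schedule(int rounds)
{
    Turn first_half, second_half;
    std::mutex log_mutex;
    std::vector<std::string> log;

    auto record = [&](const std::string& what) {
        std::lock_guard<std::mutex> guard(log_mutex);
        log.push_back(what);
    };

    auto worker = [&](int me, int other, const std::string& name) {
        for (int r = 0; r < rounds; ++r) {
            first_half.wait_for(me);
            record(name + " first");   // the first half would run here
            first_half.hand_to(other);

            second_half.wait_for(me);
            record(name + " second");  // the second half would run here
            second_half.hand_to(other);
        }
    };

    std::thread t1(worker, 1, 2, std::string("foo1"));
    std::thread t2(worker, 2, 1, std::string("foo2"));
    t1.join();
    t2.join();
    return log;
}
```

The returned log always starts with "foo1 first" and ends with "foo2 second"; the entries in between may interleave differently from run to run, because the two threads genuinely overlap there.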

Answer 3

Sample code that uses an atomic pointer to "lock" certain locations in memory:

#include <vector>
#include <atomic>
#include <thread>

using container = std::vector<std::atomic<double>>;
using container_size_type = container::size_type;

container c(300);

std::atomic<container::pointer> p_busy_elem{ nullptr };

void editor()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        p_busy_elem.exchange(&c[i]); // c[i] is busy
        // ... edit c[i] ... // E: calculate a value and assign it to c[i]
        p_busy_elem.exchange(nullptr); // c[i] is no longer busy
    }
}

void reader()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        // A1: wait for editor thread to finish editing value
        while (p_busy_elem == &c[i])
        {
            // A2: room for a better algorithm to prevent blocking/yielding
            std::this_thread::yield();
        }

        // B: if c[i] is updated in between A and B, this will load the latest value
        auto value = c[i].load();

        // C: c[i] might have changed by this time, but we had the most up to date value we could get without checking again
        // ... use value ...
    }
}

int main()
{
    std::thread t_editor{ editor };
    std::thread t_reader{ reader };
    t_editor.join();
    t_reader.join();
}

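In the editor thread, the busy pointer is set to indicate that the memory location is currently being edited (E). If the reader thread tries to read the value after the busy pointer has been set, it waits until the edit is finished before continuing (A1).

A note on A2: a better system could be put here. A list of the elements that were busy when a read was attempted could be kept; we would add i to that list and try to process the list later. The benefit: the loop can be told to continue, and the indices after the i currently being edited will be read.

A copy of the value is made (B) so that it can be used (C) however needed. This is the last time we check c[i] for its latest value.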
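The deferred-list idea from the note on A2 might look roughly like this. It is a sketch under the same assumptions as the sample code (a busy pointer published by the editor); read_with_deferral and its parameters are hypothetical names, and the container and busy pointer are passed in rather than being globals.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

using container = std::vector<std::atomic<double>>;

// Reads every element of c. An index that is marked busy on the first pass is
// skipped and remembered, then retried once the rest of the array is done.
std::vector<double> read_with_deferral(
    container& c, std::atomic<std::atomic<double>*>& p_busy_elem)
{
    std::vector<double> values(c.size());
    std::vector<std::size_t> deferred;

    for (std::size_t i = 0; i < c.size(); ++i) {
        if (p_busy_elem.load() == &c[i]) {
            deferred.push_back(i);  // come back to this index later
            continue;
        }
        values[i] = c[i].load();
    }

    for (std::size_t i : deferred) {
        // Still yields if the editor happens to be on this element again.
        while (p_busy_elem.load() == &c[i])
            std::this_thread::yield();
        values[i] = c[i].load();
    }
    return values;
}
```

The reader only waits in the second pass, so a single busy element no longer stalls the reads of all the indices after it.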

Answer 4

This appears to be the core of your requirement:

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

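The simplest way to implement this interleaving with pthreads is to use a barrier.

Initialise the barrier with pthread_barrier_init() using a count of 2. foo1() then executes: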

first half
pthread_barrier_wait()
second half
pthread_barrier_wait()
...
first half
pthread_barrier_wait()
second half
pthread_barrier_wait()

and foo2() executes a slightly different sequence:

pthread_barrier_wait()
first half
pthread_barrier_wait()
second half
....
pthread_barrier_wait()
first half
pthread_barrier_wait()
second half
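The two sequences above can be sketched as runnable code as follows. This assumes a platform that provides POSIX barriers (for example Linux). The g_* variables, record and run_barrier_schedule are hypothetical names for the example; the real halves would replace the record calls.

```cpp
#include <pthread.h>
#include <string>
#include <vector>

// Hypothetical globals for the demo; only the pthread_* calls are POSIX.
static pthread_barrier_t g_barrier;
static pthread_mutex_t g_log_mutex = PTHREAD_MUTEX_INITIALIZER;
static std::vector<std::string> g_log;
static int g_rounds = 0;

static void record(const char* what)
{
    pthread_mutex_lock(&g_log_mutex);
    g_log.push_back(what);
    pthread_mutex_unlock(&g_log_mutex);
}

static void* foo1(void*)
{
    for (int r = 0; r < g_rounds; ++r) {
        record("foo1 first");            // first half
        pthread_barrier_wait(&g_barrier);
        record("foo1 second");           // second half
        pthread_barrier_wait(&g_barrier);
    }
    return nullptr;
}

static void* foo2(void*)
{
    for (int r = 0; r < g_rounds; ++r) {
        pthread_barrier_wait(&g_barrier);
        record("foo2 first");            // first half
        pthread_barrier_wait(&g_barrier);
        record("foo2 second");           // second half
    }
    return nullptr;
}

// Runs both threads and returns the order in which the halves executed.
std::vector<std::string> run_barrier_schedule(int rounds)
{
    g_rounds = rounds;
    g_log.clear();
    pthread_barrier_init(&g_barrier, nullptr, 2);  // two participating threads

    pthread_t t1, t2;
    pthread_create(&t1, nullptr, foo1, nullptr);
    pthread_create(&t2, nullptr, foo2, nullptr);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);

    pthread_barrier_destroy(&g_barrier);
    return g_log;
}
```

Both threads call pthread_barrier_wait() the same number of times, so neither is left waiting at the end. Between two consecutive barriers the halves of foo1 and foo2 run in parallel, which is why only the first and last log entries have a fixed position.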

How to lock a mutex for a single element of an array rather than for the whole array

4 answers: