Question

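I am trying to write a simulation in which different threads need to perform a given calculation at thread-specific intervals (in the minimal example here, the intervals are between 1 and 4), based on an atomic simulation time managed by the parent thread.

The idea is to have the parent advance the simulation by a single time step (here always 1, for simplicity), then have all the threads independently check whether they need to do a calculation and, once they have checked, decrement an atomic counter and wait for the next step. I expect that after running this code, the number of calculations for each thread will be exactly the length of the simulation (i.e. 10000 steps) divided by the thread-specific interval (so a thread with an interval of 4 should perform exactly 2500 calculations).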

#include <thread>
#include <iostream>
#include <atomic>

std::atomic<int> simTime;
std::atomic<int> tocalc;
int end = 10000;

void threadFunction(int n);

int main() {
    int nthreads = 4;
    std::thread threads[nthreads];
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii] = std::thread(threadFunction, ii+1);
    }
    simTime = 0;
    tocalc = 0;
    while (simTime < end) {
        tocalc = nthreads - 1;
        simTime += 1;
        // do calculation
        while (tocalc > 0) {
            // wait until all the threads have done their calculation
            // or at least checked to see if they need to
        }
    }
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii].join();
    }
}

void threadFunction(int n) {
    int prev = simTime;
    int fix = prev;
    int ncalcs = 0;
    while (simTime < end) {
        if (simTime - prev > 0) {
            prev = simTime;
            if (simTime - fix >= n) {
                // do calculation
                ncalcs ++;
                fix = simTime;
            }
            tocalc --;
        }
    }
    std::cout << std::to_string(n)+" {ncalcs} - "+std::to_string(ncalcs)+"\n";
}

However, the output does not match that expectation. One possible output is:

2 {ncalcs} - 4992
1 {ncalcs} - 9983
3 {ncalcs} - 3330
4 {ncalcs} - 2448

The expected output is:

2 {ncalcs} - 5000
1 {ncalcs} - 10000
3 {ncalcs} - 3333
4 {ncalcs} - 2500
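For reference, the expected counts are simply the run length divided by each interval, using integer division. A minimal sketch of that arithmetic (the helper name `expected_calcs` is made up for illustration and is not part of the code above):

```cpp
#include <cassert>

// Hypothetical helper: how many calculations a thread with interval n
// should perform over `steps` simulation steps (integer division).
int expected_calcs(int steps, int n) {
    assert(n > 0 && steps >= 0);
    return steps / n;
}
```

For 10000 steps this gives 10000, 5000, 3333 and 2500 for intervals 1 to 4.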

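I am wondering whether anyone has insight into why this method of forcing the threads to wait for the next step fails: whether it is a simple problem in my code or a more fundamental problem with the approach. Thanks for any insight you can offer.

Note

I am using this approach because the overhead of the other approaches I have tried (e.g. using pipes, or joining at every step) is prohibitively high. If there is a cheaper way of communicating between threads, I am open to such suggestions.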

Answer 1

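To expand on the comments: initializing tocalc to nthreads - 1 means that on some iterations all the child threads will decrement tocalc before the parent evaluates it. The order in which the reads and writes to the atomics take effect depends on how the threads happen to be scheduled. So sometimes the sequence may be:

Child 1 decrements tocalc, new value is 2
Child 3 decrements tocalc, new value is 1
Child 4 decrements tocalc, new value is 0
Child 2 decrements tocalc, new value is -1
Parent evaluates tocalc > 0, which is false, so the simulation advances

Other times, the parent's evaluation may be scheduled before the last thread decrements tocalc, i.e.:

Child 1 decrements tocalc, new value is 2
Child 3 decrements tocalc, new value is 1
Child 4 decrements tocalc, new value is 0
Parent evaluates tocalc > 0, which is false, so the simulation advances
Child 2 decrements tocalc (which the parent has just reset to 3), new value is 2

In this case, child thread 2 loses an iteration. Because the scheduling order is semi-random, this does not happen every time, so the number of misses is not a linear function of the number of threads but rather a small fraction of the total number of iterations. If you modify the code to the version below, it will produce the required result.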

#include <thread>
#include <iostream>
#include <atomic>
#include <string>  // std::to_string

std::atomic<int> simTime;
std::atomic<int> tocalc;
int end = 10000;

void threadFunction(int n);

int main() {
    const int nthreads = 4;  // const, so the array bound below is a constant expression
    simTime = 0;
    tocalc = 0;
    std::thread threads[nthreads];
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii] = std::thread(threadFunction, ii+1);
    }

    // Parent loop: advance one step, then spin until every child has checked in.
    while (simTime < end) {
        tocalc = nthreads;
        simTime += 1;
        // do calculation
        while (tocalc > 0) {
            // wait until all the threads have done their calculation
            // or at least checked to see if they need to
        }
    }
    for (int ii = 0; ii < nthreads; ii ++) {
        threads[ii].join();
    }
}

void threadFunction(int n) {
    int prev = 0;
    int fix = prev;
    int ncalcs = 0;
    while (simTime < end) {
        if (simTime - prev > 0) {
            prev = simTime;
            if (simTime - fix >= n) {
                // do calculation
                ncalcs ++;
                fix = simTime;
            }
            tocalc --;
        }
    }
    std::cout << std::to_string(n)+" {ncalcs} - "+std::to_string(ncalcs)+"\n";
}

One possible output is (the order in which the threads finish is somewhat random):

2 {ncalcs} - 5000
3 {ncalcs} - 3333
1 {ncalcs} - 10000
4 {ncalcs} - 2500
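
As a rough illustration of why the initial value of the counter matters, the two orderings described above can be replayed in a single thread. This is only a sketch: `checkins_before_release` is a name invented here, and it models the counter with a plain int rather than with real threads.

```cpp
#include <cassert>

// Hypothetical helper: replay the child decrements one at a time and return
// how many children have checked in at the first moment the parent's
// `while (tocalc > 0)` wait could fall through.
int checkins_before_release(int initial, int nchildren) {
    assert(nchildren > 0);
    int tocalc = initial;
    int checked_in = 0;
    while (tocalc > 0 && checked_in < nchildren) {
        --tocalc;      // one child finishes its check for this step
        ++checked_in;
    }
    return checked_in;
}
```

With the question's initial value, `checkins_before_release(3, 4)` returns 3, so the parent can move on while one child has not yet checked in. With `tocalc = nthreads`, `checkins_before_release(4, 4)` returns 4.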

Answer 2

Using a similar setup, I noticed that not every thread reaches the count you would expect; some fall short by exactly one. For example:

2 {ncalcs} - 4999
4 {ncalcs} - 2500
1 {ncalcs} - 9999
3 {ncalcs} - 3333

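and so on, seemingly at random in terms of which threads and how many threads it happens to. I am not sure what causes it, but I thought it would be good to give a warning. You can work around it by checking whether simTime - fix == 0 on exit and, if not, doing another calculation before exiting.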
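
A likely cause of the off-by-one is that a worker can see simTime reach end in its `while (simTime < end)` test and leave the loop before handling the final step. One way to sidestep that is to have each worker loop until it has itself processed the last step, reading simTime once per pass. The sketch below is a variant written under assumptions, not the code from either answer: `run_simulation` is a made-up name, the counters are local rather than global, and `std::this_thread::yield()` is added so the spin loops stay usable on machines with few cores.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Run `nthreads` workers with intervals 1..nthreads for `end` steps and
// return how many calculations each worker performed.
std::vector<int> run_simulation(int nthreads, int end) {
    assert(nthreads > 0 && end >= 0);
    std::atomic<int> simTime{0};
    std::atomic<int> tocalc{0};
    std::vector<int> ncalcs(nthreads, 0);
    std::vector<std::thread> threads;

    for (int i = 0; i < nthreads; ++i) {
        threads.emplace_back([&, i] {
            const int n = i + 1;  // this worker's interval
            int prev = 0, fix = 0;
            // Loop until this worker has handled the final step itself.
            while (prev < end) {
                const int now = simTime.load();  // one snapshot per pass
                if (now == prev) {
                    std::this_thread::yield();   // no new step yet
                    continue;
                }
                prev = now;
                if (now - fix >= n) {
                    ++ncalcs[i];  // stands in for the real calculation
                    fix = now;
                }
                tocalc.fetch_sub(1);  // check in for this step
            }
        });
    }

    for (int step = 1; step <= end; ++step) {
        tocalc.store(nthreads);  // reset the counter before publishing the step
        simTime.store(step);
        while (tocalc.load() > 0) {
            std::this_thread::yield();  // wait for every worker to check in
        }
    }
    for (auto& t : threads) {
        t.join();
    }
    return ncalcs;
}
```

Called as `run_simulation(4, 10000)`, each worker handles every step exactly once, so the counts should come out as 10000, 5000, 3333 and 2500 regardless of scheduling.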

Synchronizing child threads to an atomic time managed by the parent
