Question

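I want to use the C++17 parallel features to divide each element of a std::vector by some constant and store the result in another std::vector of the same length and (!!) the same order.

E.g.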

{6,9,12} / 3 = {2,3,4}

I have an example that does not compile:

#include <execution>
#include <algorithm>

template <typename T>
std::vector<T> & divide(std::vector<T> const & in)
{
  std::vector<T> out(in.size(), 0);

  float const divisor = 3;

  std::for_each
  ( std::execution::par_unseq
  , in.begin()
  , in.end()
  , /* divide each element by divisor and put result in out */ );

  return out;
}

How do I get this to run, lock-free and thread-safe?

Answer 1

Something like this:

#include <vector>
#include <algorithm>
#include <execution>

template <typename T>
std::vector<T> divide(std::vector<T> result)
{
    // ^^ take a copy of the argument here - will often be elided anyway 

    float const divisor = 3;

    // the following loop mutates distinct objects within the vector and
    // invalidates no iterators. c++ guarantees that each object is distinct
    // and that neighbouring objects may be updated by different threads
    // at the same time without a mutex.
    std::for_each(
        std::execution::par, 
        std::begin(result), 
        std::end(result), 
        [divisor](T& val) {  // divisor is captured by copy: safer, and the resulting code will be as quick.
            // modifies value in place
            val /= divisor;
        });

    // implicit fence here: from here on it is safe to manipulate
    // the vector as a whole.

    // return by value. A by-value parameter is moved out, not copied.
    return result;
}
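
A possible way to use this, as a minimal sketch: because the function takes its argument by value, the caller decides whether the input is copied or handed over. The name divide_by and the divisor parameter are assumptions added for illustration; the answer above hard-codes the divisor as 3.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical variant of the answer's divide(): the divisor is a parameter
// here instead of being hard-coded.
template <typename T>
std::vector<T> divide_by(std::vector<T> values, float divisor)
{
    // Each element is a distinct object, so different threads may update
    // different elements concurrently without a lock.
    std::for_each(
        std::execution::par,
        values.begin(),
        values.end(),
        [divisor](T& val) { val /= divisor; });

    // The by-value parameter is moved into the return value.
    return values;
}
```

Calling divide_by(v, 3.0f) copies v and leaves it untouched, while passing a temporary or std::move(v) reuses the existing storage. Element i of the result always corresponds to element i of the input, whichever thread processed it.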

Answer 2

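For this you want std::transform, not std::for_each. Transform takes an input range and an output iterator.

The nice thing about std::transform is that it is easy to distribute across multiple CPU cores if needed. So: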

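#include <vector>
#include <algorithm>
#include <execution>

template <typename T>
std::vector<T> divide(std::vector<T> const & in)
{
  // same length as the input, value-initialised
  std::vector<T> out(in.size());

  float const divisor = 3;

  // unary std::transform takes the input range and only the start of the
  // output range; the result for in[i] is written to out[i], so the order
  // is preserved
  std::transform
  ( std::execution::par_unseq,
    in.begin(),
    in.end(),
    out.begin(),
        [divisor](T val) {
            // returns the quotient; the input is not modified
            return static_cast<T>(val / divisor);
        });

  // return by value: returning a reference to the local out would dangle
  return out;
}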
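
Because std::transform writes to a separate output range, the output element type does not have to match the input type. The following is only a sketch of that idea; the name divide_to_float and the divisor parameter are assumptions, not part of the answer. It divides integers and keeps the fractional part by storing the quotients as float.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical helper: integer input, float output.
std::vector<float> divide_to_float(std::vector<int> const& in, float divisor)
{
    std::vector<float> out(in.size());

    // Unary transform: input range plus the start of the output range.
    // out[i] receives the result for in[i], regardless of which thread
    // computes it, so the order is preserved.
    std::transform(
        std::execution::par_unseq,
        in.begin(),
        in.end(),
        out.begin(),
        [divisor](int val) { return val / divisor; });

    return out;
}
```

With the input {7, 9, 12} and a divisor of 2 this yields 3.5, 4.5 and 6, where an integer output vector would have truncated the first two values.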

Side note: if you care about speed, enable -ffast-math or multiply by (1 / divisor).
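
The second suggestion could look roughly like this: compute the reciprocal once, then multiply each element by it. This is a sketch under assumptions: the name scale_by_reciprocal is made up, it is restricted to float, and the execution policy is left out to keep the focus on the arithmetic (it can be passed as the first argument, as in the code above).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: multiplies by a precomputed reciprocal instead of
// dividing each element.
std::vector<float> scale_by_reciprocal(std::vector<float> const& in, float divisor)
{
    std::vector<float> out(in.size());

    // One division up front; the per-element work is a multiplication.
    float const reciprocal = 1.0f / divisor;

    std::transform(
        in.begin(),
        in.end(),
        out.begin(),
        [reciprocal](float val) { return val * reciprocal; });

    return out;
}
```

Note that val * (1 / divisor) is rounded twice, so for general divisors the result can differ from val / divisor in the last bit. For integer element types that difference can change the truncated result, which is why the sketch sticks to float.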

C++17 / C++1z parallel use of std::for_each
