Question

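I'm working on a project (essentially a physics simulation) in which I need to perform calculations on a large number of nodes over many time steps. So far I have implemented each type of calculation by writing a custom functor that is called in thrust::transform.

As a minimal example (with pseudocode), suppose I have some data that all share the same structure but can be broken down into different types (A, B and C), e.g. they all have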

double value.

So I store this data in a single device_vector, as follows:

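#include <thrust/device_vector.h>

class Data {
    thrust::device_vector<double> values;
    // Index range of each type within 'values'; each end is one past the last element.
    unsigned values_begin_A, values_end_A;
    unsigned values_begin_B, values_end_B;
    unsigned values_begin_C, values_end_C;
};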

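where type A occupies the first part of the vector, followed by type B, then type C. To keep track, I store the begin/end index values for each type.

Different functors need to operate on different types of data (e.g. functor1 is applied to types A and B; functor2 is applied to A, B and C; functor3 is applied to A and C). Each functor needs access to the index of the value in the vector, which is supplied by a counting_iterator, and stores its result in a separate vector.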

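struct my_functor : public thrust::unary_function< thrust::tuple<unsigned, double> , double > {

    __host__ __device__
    double operator() (const thrust::tuple<unsigned, double>& index_value) const {

        // Unpack the (index, value) pair produced by the zip iterator.
        const unsigned index = thrust::get<0>(index_value);
        const double value = thrust::get<1>(index_value);

        // Do something with the index and value.
        double result = value; // placeholder for the real computation

        return result;
    }
};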

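My problem is that I don't know the best way to implement that last functor, which acts on types A and C while skipping B. In particular, I'm looking for a Thrust-friendly solution that scales reasonably as I add more node types and more functors (acting on combinations of old and new types), while still benefiting from parallelization.

I've come up with four options:

Option 1:

One transform call per data type, e.g.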

void Option_One(thrust::device_vector<double>& result) {
    // Multiple transform calls.

    thrust::counting_iterator<unsigned> index(0);

    // Apply functor to 'A' values.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_end_A,
        result.begin(),
        my_functor());

    // Apply functor to 'C' values.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_begin_C,
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_end_C,
        result.begin() + values_begin_C,
        my_functor());
}

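This is simple, but it seems to come at the cost of efficiency, because I sacrifice the ability to evaluate A and C in parallel.

Option 2:

Copy the values into a temporary vector, call transform on the temporary vector, then copy the temporary results back into result. This seems like a lot of copying back and forth, but it needs only a single call to transform to cover both A and C.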

void Option_Two(thrust::device_vector<double>& result) {

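    // Sizes of the 'A' and 'C' ranges (each end index is one past the last element).
    const unsigned size_A = values_end_A - values_begin_A;
    const unsigned size_C = values_end_C - values_begin_C;

    // Note: this counts positions in the temporary vector, not original indices.
    thrust::counting_iterator<unsigned> index(0);

    // Copy 'A' and 'C' values into temporary vector
    thrust::device_vector<double> temp_values_A_and_C(size_A + size_C);
    thrust::copy(values.begin() + values_begin_A, values.begin() + values_end_A, temp_values_A_and_C.begin());
    thrust::copy(values.begin() + values_begin_C, values.begin() + values_end_C, temp_values_A_and_C.begin() + size_A);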

    // Store results in temporary vector.
    thrust::device_vector<double> temp_results_A_and_C(size_A + size_C);

    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, temp_values_A_and_C.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, temp_values_A_and_C.begin())) + size_A + size_C,
        temp_results_A_and_C.begin(),
        my_functor());


    // Copy temp results back into result
    // ....
}

Option 3:

Call transform on all of the values, but change the functor to check the index and only act on indices in the A or C ranges.

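struct my_functor_with_index_checking : public thrust::unary_function< thrust::tuple<unsigned, double> , double > {

    // Range bounds are copied into the functor so they are available on the device.
    unsigned begin_A, end_A, begin_C, end_C;

    my_functor_with_index_checking(unsigned begin_A_, unsigned end_A_, unsigned begin_C_, unsigned end_C_)
        : begin_A(begin_A_), end_A(end_A_), begin_C(begin_C_), end_C(end_C_) {}

    __host__ __device__
    double operator() (const thrust::tuple<unsigned, double>& index_value) const {

        const unsigned index = thrust::get<0>(index_value);
        const double value = thrust::get<1>(index_value);

        // End indices are exclusive, matching the iterator arithmetic in Option 1.
        if ( (index >= begin_A && index < end_A ) ||
            ( index >= begin_C && index < end_C ) ) {

                // Do something with the index and value.
                double result = value; // placeholder for the real computation
                return result;
             }
        else {
            // Do nothing;
            return 0; //Result is 0 by default.
        }
    }
};

void Option_Three(thrust::device_vector<double>& result) {

    thrust::counting_iterator<unsigned> index(0);

    // Apply functor to all values, but check index inside functor.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values.size(),
        result.begin(),
        my_functor_with_index_checking(values_begin_A, values_end_A, values_begin_C, values_end_C));
}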

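Option 4:

The last option I came up with is to create a custom iterator based on counting_iterator that counts normally through the A range but, once it reaches the end of A, jumps to the beginning of C. This seems like a nice solution, but I don't know how to do it.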

void Option_Four(thrust::device_vector<double>& result) {

    // Create my own version of a counting iterator
    // that skips from the end of 'A' to the beginning of 'C'
    // I don't know how to do this!
    FancyCountingIterator fancyIndex(0); 

    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(fancyIndex, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(fancyIndex, values.begin())) + values.size(),
        result.begin(),
        my_functor());
}
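
One possible way to get the skipping index that Option 4 describes, without writing a new iterator class, is to express the skip as a plain index-mapping functor. The sketch below is only that mapping, under the assumption that end indices are exclusive as in Option 1. The names skip_B_index and mapped_indices are hypothetical helpers, not part of Thrust, and the host loop exists only to show what the mapping produces.

```cpp
#include <cassert>
#include <vector>

// Let the same functor compile with a plain C++ compiler as well as with nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper (not part of Thrust): maps a compact position k in
// [0, size_A + size_C) to the original index in 'values', skipping the 'B' block.
struct skip_B_index {
    unsigned begin_A, end_A, begin_C;  // end_A is one past the last 'A' element

    __host__ __device__
    unsigned operator()(unsigned k) const {
        const unsigned size_A = end_A - begin_A;
        // Positions below size_A fall in 'A'; the rest continue from the start of 'C'.
        return (k < size_A) ? begin_A + k : begin_C + (k - size_A);
    }
};

// Host-side illustration only: lists the original indices visited, in order.
std::vector<unsigned> mapped_indices(unsigned begin_A, unsigned end_A,
                                     unsigned begin_C, unsigned end_C) {
    const skip_B_index map{begin_A, end_A, begin_C};
    const unsigned n = (end_A - begin_A) + (end_C - begin_C);
    std::vector<unsigned> out;
    for (unsigned k = 0; k < n; ++k) {
        out.push_back(map(k));
    }
    return out;
}
```

In Thrust, a functor like this could presumably be wrapped with thrust::make_transform_iterator over a thrust::counting_iterator<unsigned> to play the role of fancyIndex, and the same index iterator could be passed to thrust::make_permutation_iterator together with values.begin() and result.begin(), so that a single transform over size_A + size_C elements reads and writes only the A and C entries. I have not benchmarked this against the other options.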

How can I implement thrust::transform with a custom functor that skips part of a device_vector?

0 answers