How do I select the backend for Thrust 1.7 (CUDA 5.5) when I use thrust::counting_iterator?

Asked: 2013-12-23 23:59:19

Tags: cuda gpgpu nvidia thrust

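When I pass thrust::device_vector iterators to an algorithm, Thrust automatically selects the GPU backend because the vector's data lives on the GPU. But when I pass only thrust::counting_iterator arguments to an algorithm, how do I choose which backend it executes on?

In the thrust::find call below there are no device_vector iterator arguments, so how does Thrust choose which backend (CPU, OMP, TBB, CUDA) to use?

How can I control which backend this algorithm executes on without using thrust::device_vector<> in this code?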

thrust::counting_iterator<uint64_t> first(i);
thrust::counting_iterator<uint64_t> last = first + step_size;

auto iter = thrust::find( 
            thrust::make_transform_iterator(first, functor),
            thrust::make_transform_iterator(last, functor),
            true);

Update (23.01.14): MSVS2012, CUDA 5.5, Thrust 1.7

This compiles successfully!

#include <cstdint>   // uint64_t
#include <iostream>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/find.h>
#include <thrust/functional.h>

#include <thrust/execution_policy.h>

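// Predicate usable on both host and device; const so it can be called through a const functor.
struct is_odd : public thrust::unary_function<uint64_t, bool> {
  __host__ __device__ bool operator()(uint64_t const& x) const {
    return (x & 1) != 0;
  }
};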

int main() {
    thrust::counting_iterator<uint64_t> first(0);
    thrust::counting_iterator<uint64_t> last = first + 100;

    auto iter = thrust::find(thrust::device,
                thrust::make_transform_iterator(first, is_odd()),
                thrust::make_transform_iterator(last, is_odd()),
                true);

    int bbb; std::cin >> bbb;
    return 0;
}

3 Answers:

Answer 0 (score: 3)

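Sometimes it can be ambiguous where a Thrust algorithm executes, as in your counting_iterator example, because its associated "backend system" is thrust::any_system_tag (a counting_iterator can be dereferenced anywhere because it is not backed by data). In situations like this, Thrust uses the device backend. By default, that is CUDA. However, you can explicitly control where execution happens in a couple of ways.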

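You can either specify the system explicitly through the template parameter, as in ngimel's answer, or pass the thrust::device execution policy as the first argument to thrust::find in your example: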

#include <thrust/execution_policy.h>
...
thrust::counting_iterator<uint64_t> first(i);
thrust::counting_iterator<uint64_t> last = first + step_size;

auto iter = thrust::find(thrust::device,
                         thrust::make_transform_iterator(first, functor),
                         thrust::make_transform_iterator(last, functor),
                         true);

This technique requires Thrust 1.7 or later.
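To picture the two behaviors described in this answer (an "any" tag falling back to the device backend, and an explicit policy overriding it), here is a toy tag-dispatch model in plain C++. It is only a sketch and not Thrust's actual implementation: host_tag, device_tag, any_tag, counting_iter, resolve, run and find_backend are all hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical system tags, loosely modeled on the idea of Thrust's system tags.
struct host_tag {};
struct device_tag {};
struct any_tag {};  // "valid anywhere": the iterator is not backed by stored data

// Toy iterator that carries its system tag as a template parameter.
template <typename System = any_tag>
struct counting_iter {
    using system = System;
    std::uint64_t value;
};

// Tag resolution: a concrete tag stays as-is, any_tag falls back to device_tag.
template <typename S> struct resolve { using type = S; };
template <> struct resolve<any_tag> { using type = device_tag; };

// Stand-ins for backend-specific implementations.
inline std::string run(host_tag) { return "host"; }
inline std::string run(device_tag) { return "device"; }

// No policy given: the backend is inferred from the iterator's tag.
template <typename Iter>
std::string find_backend(Iter, Iter) {
    return run(typename resolve<typename Iter::system>::type{});
}

// Explicit policy as first argument: it takes precedence over the iterator's tag.
template <typename Policy, typename Iter>
std::string find_backend(Policy policy, Iter, Iter) {
    return run(policy);
}

// Small self-check covering the three cases.
inline void check_dispatch() {
    counting_iter<> first{0}, last{10};
    assert(find_backend(first, last) == "device");            // any_tag falls back to device
    assert(find_backend(host_tag{}, first, last) == "host");  // explicit policy wins
    counting_iter<host_tag> hfirst{0}, hlast{10};
    assert(find_backend(hfirst, hlast) == "host");            // tag fixed on the iterator
}
```

In this model, the third case corresponds to fixing the system on the iterator type, and the second to passing an execution policy as the first argument.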

Answer 1 (score: 2)

You must specify the System template parameter when instantiating counting_iterator:

 typedef thrust::device_system_tag  System;
 thrust::counting_iterator<uint64_t,System> first(i);

Answer 2 (score: 1)

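If you are using a current version of Thrust, do it the way Jared Hoberock describes. However, if you might be using an older version (the system you work on may have an older version of CUDA), the example below may help.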

#include <thrust/version.h>

// Compare the full THRUST_VERSION (major * 100000 + minor * 100 + subminor)
// so the checks stay correct if the major version changes.
#if THRUST_VERSION >= 100700      // 1.7+: execution policies
    #include <thrust/execution_policy.h>
#elif THRUST_VERSION >= 100600    // 1.6: retag
    #include <thrust/iterator/retag.h>
#else                             // older: the tag is a counting_iterator template argument
#endif

...

#if THRUST_VERSION >= 100700
  total =
    thrust::transform_reduce(
      thrust::host
      , thrust::counting_iterator<unsigned int>(0)
      , thrust::counting_iterator<unsigned int>(N)
      , AFunctor(), 0, thrust::plus<unsigned int>());
#elif THRUST_VERSION >= 100600
  total =
    thrust::transform_reduce(
      thrust::retag<thrust::host_system_tag>(thrust::counting_iterator<unsigned int>(0))
      , thrust::retag<thrust::host_system_tag>(thrust::counting_iterator<unsigned int>(N))
      , AFunctor(), 0, thrust::plus<unsigned int>());
#else
  total =
    thrust::transform_reduce(
      thrust::counting_iterator<unsigned int, thrust::host_space_tag>(0)
      , thrust::counting_iterator<unsigned int, thrust::host_space_tag>(N)
      , AFunctor(), 0, thrust::plus<unsigned int>());
#endif
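
The preprocessor gating in this answer relies on the version macros from <thrust/version.h>. As a rough sketch of how the combined THRUST_VERSION number is laid out (major * 100000 + minor * 100 + subminor, as described in the comments of that header), the plain C++ helper below decodes such a value. The names version_parts, decode_version and has_execution_policy are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper type: the three components of a THRUST_VERSION-style integer.
struct version_parts {
    int major_part;
    int minor_part;
    int subminor_part;
};

// Decode using the layout major * 100000 + minor * 100 + subminor.
constexpr version_parts decode_version(int v) {
    return version_parts{v / 100000, v / 100 % 1000, v % 100};
}

// Execution policies arrived in Thrust 1.7, i.e. version value 100700 and up.
constexpr bool has_execution_policy(int v) { return v >= 100700; }

// Small self-check.
inline void check_versions() {
    assert(decode_version(100700).major_part == 1);
    assert(decode_version(100700).minor_part == 7);
    assert(has_execution_policy(100700));
    assert(!has_execution_policy(100600));
    // A 2.0.0-style value has minor 0, so a test on the minor number alone
    // would misclassify it, while the full-number comparison does not.
    assert(decode_version(200000).minor_part == 0);
    assert(has_execution_policy(200000));
}
```

Comparing the whole number rather than only the minor component keeps the check meaningful across major releases.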

@see Thrust: How to directly control where an algorithm invocation executes?