Thrust: how to get the number of elements copied by copy_if when using device_ptr

Time: 2013-10-24 14:18:25

Tags: cuda thrust

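I'm using the Thrust library's thrust::copy_if function together with counting iterators to get the indices of the nonzero elements of an array. I also need the number of elements that were copied.

I'm using the code from the 'counting_iterator.cu' example, except that in my application I need to reuse preallocated arrays, so I wrap them with thrust::device_ptr and then pass them to thrust::copy_if. Here is the code: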

using namespace thrust;

int output[5];
thrust::device_ptr<int> tp_output = device_pointer_cast(output);

float stencil[5];
stencil[0] = 0;
stencil[1] = 0;
stencil[2] = 1;
stencil[3] = 0;
stencil[4] = 1;
device_ptr<float> tp_stencil = device_pointer_cast(stencil);

device_vector<int>::iterator output_end = copy_if(make_counting_iterator<int>(0), 
     make_counting_iterator<int>(5), 
     tp_stencil, 
     tp_output, 
     _1 == 1);

int number_of_ones = output_end - tp_output;

If I comment out the last line, the function fills the output array correctly. However, when I uncomment it, I get the following compile error:

1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_adaptor.h(223): error: no operator "-" matches these operands
1>     operand types are: int *const - const thrust::device_ptr<int>
1>   detected during:
1>     instantiation of "thrust::iterator_adaptor::difference_type thrust::iterator_adaptor::distance_to(const thrust::iterator_adaptor &) const [with Derived=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Base=thrust::device_ptr<int>, Value=thrust::use_default, System=thrust::use_default, Traversal=thrust::use_default, Reference=thrust::use_default, Difference=thrust::use_default, OtherDerived=thrust::device_ptr<int>, OtherIterator=int *, V=signed int, S=thrust::device_system_tag, T=thrust::random_access_traversal_tag, R=thrust::device_reference<int>, D=ptrdiff_t]"
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(181): here
1>     instantiation of "Facade1::difference_type thrust::iterator_core_access::distance_from(const Facade1 &, const Facade2 &, thrust::detail::true_type) [with Facade1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Facade2=thrust::device_ptr<int>]"
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(202): here
1>     instantiation of "thrust::detail::distance_from_result::type thrust::iterator_core_access::distance_from(const Facade1 &, const Facade2 &) [with Facade1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Facade2=thrust::device_ptr<int>]"
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(506): here
1>     instantiation of "thrust::detail::distance_from_result::type thrust::operator-(const thrust::iterator_facade &, const thrust::iterator_facade &) [with Derived1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Value1=signed int, System1=thrust::device_system_tag, Traversal1=thrust::random_access_traversal_tag, Reference1=thrust::device_reference<int>, Difference1=signed int, Derived2=thrust::device_ptr<int>, Value2=signed int, System2=thrust::device_system_tag, Traversal2=thrust::random_access_traversal_tag, Reference2=thrust::device_reference<int>, Difference2=signed int]"
1> C:/ProgramData/NVIDIA Corporation/CUDA Samples/v5.5/7_CUDALibraries/nsgaIIparallelo_23ott/rank_cuda.cu(70): here

If I use a thrust::device_vector for the output array instead, everything works:

using namespace thrust;

thrust::device_vector<int> output(5);

float stencil[5];
stencil[0] = 0;
stencil[1] = 0;
stencil[2] = 1;
stencil[3] = 0;
stencil[4] = 1;
device_ptr<float> tp_stencil = device_pointer_cast(stencil);

device_vector<int>::iterator output_end = copy_if(make_counting_iterator<int>(0), 
     make_counting_iterator<int>(5), 
     tp_stencil, 
     output.begin(), 
     _1 == 1);

int number_of_ones = output_end - output.begin();

Can you suggest a solution to this problem? Thanks.

1 Answer:

Answer 0 (score: 2)

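copy_if returns an iterator of the same type as the output iterator you pass in, which here is a device_ptr<int>. Try storing the result of the copy_if call in a device_ptr rather than a device_vector::iterator, so that both operands of the subtraction have the same type: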

thrust::device_ptr<int> output_end = copy_if(make_counting_iterator<int>(0),
 make_counting_iterator<int>(5),
 tp_stencil,
 tp_output,
 _1 == 1);
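
For completeness, here is a minimal sketch of the whole flow with the count computed at the end. It rests on a few assumptions that are not in the question: the preallocated buffers are taken to be device memory obtained with cudaMalloc (device_pointer_cast expects a raw pointer to device memory), the names d_output, d_stencil and run_example are made up for illustration, and CUDA error checking is left out to keep it short.

```cuda
#include <thrust/copy.h>
#include <thrust/device_ptr.h>
#include <thrust/functional.h>
#include <thrust/iterator/counting_iterator.h>
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: returns how many stencil entries are equal to 1.
int run_example()
{
    using namespace thrust::placeholders;  // provides _1

    const int n = 5;
    const float h_stencil[n] = {0.f, 0.f, 1.f, 0.f, 1.f};

    // Stand-ins for the application's preallocated device arrays (assumption).
    int   *d_output  = 0;
    float *d_stencil = 0;
    cudaMalloc((void **)&d_output,  n * sizeof(int));
    cudaMalloc((void **)&d_stencil, n * sizeof(float));
    cudaMemcpy(d_stencil, h_stencil, n * sizeof(float), cudaMemcpyHostToDevice);

    // Wrap the raw device pointers so Thrust dispatches to the device backend.
    thrust::device_ptr<int>   tp_output  = thrust::device_pointer_cast(d_output);
    thrust::device_ptr<float> tp_stencil = thrust::device_pointer_cast(d_stencil);

    // The result has the same type as the output iterator: device_ptr<int>.
    thrust::device_ptr<int> output_end =
        thrust::copy_if(thrust::make_counting_iterator<int>(0),
                        thrust::make_counting_iterator<int>(n),
                        tp_stencil,
                        tp_output,
                        _1 == 1);

    // Both operands are device_ptr<int>, so the subtraction compiles.
    int number_of_ones = static_cast<int>(output_end - tp_output);

    cudaFree(d_output);
    cudaFree(d_stencil);
    return number_of_ones;
}

int main()
{
    std::printf("number_of_ones = %d\n", run_example());
    return 0;
}
```

With the stencil from the question, the copied indices are 2 and 4, so the count should come out as 2. The version in the question compiles without the subtraction because a device_ptr<int> can be converted to a device_vector<int>::iterator; it is only the mixed-type operator- that has no match.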