Question

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/scan.h>
#include <thrust/execution_policy.h>
#include <iostream>
#include <thrust/transform.h>
struct text_functor {
    text_functor() {}
    __host__ __device__ int operator()(const char t) const {
        if (t == '\n') return 0;
        return 1;
    }
};

void CountPosition1(const char *text, int *pos, int text_size)
{
    thrust::transform(text, text + text_size, pos, text_functor());
}
int main() {
    char s[4] = {'a', 'a', '\n', 'a'};
    int r[4] = {0};
    int *k;
    cudaMalloc((void**) &k, sizeof(int) * 4);
    CountPosition1(s, k, 4);
}

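In thrust::transform, I mix the host iterator s with the device iterator k, and this causes a segmentation fault. If I change the argument k to r in the call to CountPosition1, the program runs correctly. Should all iterators passed to a Thrust function come from the same source (all host or all device)? Or is there something else wrong with this code?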

Answer 1

Yes. Either all iterators should come from host containers, or all iterators should come from device containers.

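When an algorithm is dispatched, Thrust takes either the host path or the device path. All iterators should be consistent with the path that is dispatched. In the code above, s and k are both raw pointers, so Thrust cannot tell that k refers to device memory; it takes the host path and dereferences k on the CPU, which is why the program crashes.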
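One way to keep every iterator on the device side is sketched below. It is a minimal illustration, not the only fix: the helper name count_position_device is hypothetical, and the sketch assumes it is acceptable to copy the text into a thrust::device_vector and copy the result back to the host.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/transform.h>

struct text_functor {
    __host__ __device__ int operator()(const char t) const {
        return t == '\n' ? 0 : 1;
    }
};

// Hypothetical helper (not from the original post): every iterator passed
// to thrust::transform comes from a device_vector, so the device path is used.
thrust::host_vector<int> count_position_device(const char *text, int text_size)
{
    // Constructing a device_vector from a host range copies host -> device.
    thrust::device_vector<char> d_text(text, text + text_size);
    thrust::device_vector<int> d_pos(text_size);

    thrust::transform(d_text.begin(), d_text.end(), d_pos.begin(), text_functor());

    // Assigning to a host_vector copies device -> host.
    thrust::host_vector<int> h_pos = d_pos;
    return h_pos;
}

int main()
{
    const char s[4] = {'a', 'a', '\n', 'a'};
    thrust::host_vector<int> r = count_position_device(s, 4);
    for (int i = 0; i < 4; ++i) printf("%d ", r[i]);  // prints: 1 1 0 1
    printf("\n");
    return 0;
}
```

If the output has to stay in memory obtained from cudaMalloc, as in the question, the raw pointer can instead be wrapped with thrust::device_pointer_cast so that Thrust recognizes it as a device iterator; the input must then live on the device as well.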
