Question

Please consider the following code:

#include <time.h>       // --- time
#include <stdlib.h>     // --- srand, rand
#include <fstream>

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/gather.h>      // --- thrust::gather
#include <thrust/sort.h>
#include <thrust/iterator/zip_iterator.h>

#include "TimingGPU.cuh"

/********/
/* MAIN */
/********/
int main() {

    const int N = 16384;

    std::ifstream h_indices_File, h_x_File;
    h_indices_File.open("h_indices.txt");
    h_x_File.open("h_x.txt");

    std::ofstream h_x_result_File;
    h_x_result_File.open("h_x_result.txt");

    thrust::host_vector<int> h_indices(N);
    thrust::host_vector<double> h_x(N);
    thrust::host_vector<double> h_sorted(N);

    for (int k = 0; k < N; k++) {
        h_indices_File >> h_indices[k];
        h_x_File >> h_x[k];
    }

    thrust::device_vector<int> d_indices(h_indices);
    thrust::device_vector<double> d_x(h_x);

    thrust::gather(d_indices.begin(), d_indices.end(), d_x.begin(), d_x.begin());
    h_x = d_x;
    for (int k = 0; k < N; k++) h_x_result_File << h_x[k] << "\n";

    //thrust::device_vector<double> d_x_sorted(N);
    //thrust::gather(d_indices.begin(), d_indices.end(), d_x.begin(), d_x_sorted.begin());
    //h_x = d_x_sorted;
    //for (int k = 0; k < N; k++) h_x_result_File << h_x[k] << "\n";

}

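The code loads an array of indices from the file h_indices.txt and an array of doubles from the file h_x.txt. It then transfers those arrays to the GPU as d_indices and d_x and uses thrust::gather to achieve the equivalent of Matlab's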

d_x(d_indices)

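The two txt files can be downloaded from h_indices.txt and h_x.txt. The code creates an output result file, h_x_result.txt.

If I use the "in-place" version of thrust::gather (the last three uncommented lines of the code), I obtain a result that differs from d_x(d_indices), while if I use the not "in-place" version (the last three commented lines of the code), the result is correct.

In Matlab, I'm using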

load h_indices.txt; load h_x.txt; load h_x_result.txt
plot(h_x(h_indices + 1)); hold on; plot(h_x_result, 'r'); hold off

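The "in-place" case returns the following comparison

On the other hand, the not "in-place" case returns

I'm using Windows 10, CUDA 8.0, and Visual Studio 2013, compiling in Release mode and running on an NVIDIA GTX 960 (compute capability 5.2).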

Answer 1

Thrust gather cannot be used in place.

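But I would go as far as to suggest that no "naive" gather operation can be safely performed in place, and that the Matlab snippet you presented as in-place (presumably d_x = d_x(d_indices)) isn't an in-place operation at all.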
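
To illustrate why a naive in-place gather is unsafe, here is a minimal host-side sketch in plain C++ (no Thrust). The function names gather_out_of_place and gather_in_place_naive are hypothetical and only serve this illustration; they are not part of Thrust. The in-place version runs sequentially here, which makes the hazard deterministic. On the GPU the order of reads and writes across threads is not specified, so the outcome could presumably vary from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Out-of-place gather: result[i] = x[indices[i]].
// Every read comes from the untouched input x, so the result is well defined.
std::vector<double> gather_out_of_place(const std::vector<int>& indices,
                                        const std::vector<double>& x) {
    std::vector<double> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < x.size());
        result[i] = x[indices[i]];
    }
    return result;
}

// Naive "in-place" gather: x[i] = x[indices[i]].
// A later iteration may read an element that an earlier iteration
// has already overwritten, so the result can differ from the
// out-of-place version.
void gather_in_place_naive(const std::vector<int>& indices,
                           std::vector<double>& x) {
    assert(indices.size() == x.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < x.size());
        x[i] = x[indices[i]];
    }
}
```

For example, with indices {1, 2, 3, 0} and x = {10, 20, 30, 40}, the out-of-place version yields {20, 30, 40, 10}, whereas the sequential in-place version yields {20, 30, 40, 20}: by the time the last element reads x[0], that slot has already been overwritten. Writing into a separate output vector, as in the commented-out lines of the question, avoids the problem.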
