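I ran into a strange error when trying to construct a thrust::device_vector<unsigned char>
with thrust::device_vector<unsigned char> data(10).
The error is "parallel_for failed: invalid device function".
Below is the minimal code that reproduces it.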
main.cpp
#ifdef UNIT_TEST
#define CATCH_CONFIG_MAIN
#include "catch.hpp"
#endif // UNIT_TEST
myheader.h
#ifndef MYHEADER_H_
#define MYHEADER_H_
#include <string>
#include <vector>
#include <thrust/device_vector.h>
namespace AAA {
namespace BBB{
using byte_t = unsigned char;
using ByteValues = std::vector<byte_t>;
using DByteValues = thrust::device_vector<byte_t>;
#define NaB byte_t(-1) // Not-a-Byte
} // namespace BBB
} // namespace AAA
#endif // MYHEADER_H_
mytest.cu
#include "catch.hpp"
#include "myheader.h"
namespace AAA {
namespace BBB {
TEST_CASE("Test thrust::device_vector", "[thrust::device_vector]") {
SECTION("constructor should work") {
REQUIRE_NOTHROW( DByteValues(10) );
}
}
}
}
Build commands
g++ -DUNIT_TEST -std=c++14 -g3 -O0 -Wall -fmessage-length=0 -pthread -I/usr/local/cuda/include -Iinclude -Iunit-test -I/usr/local/include -c -o .obj/debug/./main.o main.cpp
nvcc -DUNIT_TEST -std=c++14 -m64 -arch=compute_30 -code=sm_30 -dc -expt-extended-lambda -g -G -Xcompiler -Wall,-fmessage-length=0,-pthread -I/usr/local/cuda/include -Iinclude -Iunit-test -I/usr/local/include -c -o .obj/debug/unit-test/mytest.o unit-test/mytest.cu
nvcc -Xlinker -s -L/usr/local/lib -L/usr/local/cuda/lib64 -lcudart -o .bin/debug/MyTest .obj/debug/./main.o .obj/debug/unit-test/mytest.o
System information
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
$ g++ --version
g++ (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ nvidia-smi
Mon Jan 21 22:34:40 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 745     Off  | 00000000:01:00.0  On |                  N/A |
| 20%   41C    P8    N/A /  N/A |    158MiB /  4040MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1399      G   /usr/lib/xorg/Xorg                            59MiB |
|    0      1556      G   /usr/bin/sddm-greeter                         95MiB |
+-----------------------------------------------------------------------------+
After building the test project, I ran it and got the following error.
$ .bin/debug/MyTest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MyTest is a Catch v2.5.0 host application.
Run with -? for options
-------------------------------------------------------------------------------
Test thrust::device_vector
constructor should work
-------------------------------------------------------------------------------
unit-test/mytest.cu:9
...............................................................................
unit-test/mytest.cu:10: FAILED:
REQUIRE_NOTHROW( DByteValues(10) )
due to unexpected exception with message:
parallel_for failed: invalid device function
===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed
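One check that might help narrow this down (a sketch only, not something verified in this post): "invalid device function" is commonly reported when the executable contains no device code that the installed GPU can run. The program below is a hypothetical standalone diagnostic, not part of the project above. It prints the compute capability of device 0 so that it can be compared with the -arch=compute_30 -code=sm_30 flags used in the build commands. It uses only cudaGetDeviceProperties and cudaGetErrorString from the CUDA runtime API and assumes device index 0 is the GPU in question.

```cpp
// check_cc.cu: hypothetical diagnostic, build with e.g. `nvcc -o check_cc check_cc.cu`
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const int device = 0;  // assumption: the GPU under test is device 0
    cudaDeviceProp prop;

    // Query the properties of the selected device.
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // prop.major / prop.minor form the compute capability, e.g. 3.0 for sm_30.
    std::printf("%s: compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);
    return 0;
}
```

If the reported compute capability differs from 3.0, the -arch/-code values in the nvcc compile line would be the first thing to revisit.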
Please help. Thank you!