I keep getting this compile error:
ptxas fatal : Unresolved extern function 'cudaMemcpyAsync'
It comes from this function in buffer.cuh:
__device__ void markBuffer(volatile bool* is_ready_for_write_list, void* volatile * data_list, void* data, size_t num_samples_per_read, size_t sample_offset, void** tensor_list) {
    size_t index = getIndex(num_samples_per_read, sample_offset);
    data_list[index] = data;
    cudaMemcpyAsync((tensor_list)[index], ((int*)(data_list[index])) - (num_samples_per_read - 1), num_samples_per_read * 4, cudaMemcpyDeviceToDevice, NULL);
    is_ready_for_write_list[index] = true;
}
buffer.cuh is included in nv_wavenet.cuh, which in turn is included in nv_wavenet_test.cu.
I am compiling with:
nvcc -arch=sm_61 -std=c++11 -g --use_fast_math -G -g -maxrregcount 128 nv_wavenet_test.cu matrix.cpp nv_wavenet_reference.cpp -o nv_wavenet_test
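For context, this error usually means a CUDA runtime API function (here cudaMemcpyAsync) is being called from __device__ code without device-runtime support enabled. Device-side runtime calls are part of CUDA Dynamic Parallelism, which needs relocatable device code (-rdc=true) and linking against the device runtime library (-lcudadevrt). The command below is only a sketch of that change applied to the command above. It is an assumption that this resolves the error for this project; it has not been verified against nv_wavenet, and device-side cudaMemcpyAsync has its own restrictions that vary by CUDA toolkit version.

```sh
# Assumed fix: same command as above, plus -rdc=true (relocatable device code)
# and -lcudadevrt (CUDA device runtime library). Not verified on this project.
nvcc -arch=sm_61 -rdc=true -std=c++11 -g --use_fast_math -G -maxrregcount 128 nv_wavenet_test.cu matrix.cpp nv_wavenet_reference.cpp -o nv_wavenet_test -lcudadevrt
```

If enabling the device runtime is not desirable, another option is to do the copy with an ordinary loop inside markBuffer, or to issue the cudaMemcpyAsync from host code instead.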