Question

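I am writing some CUDA kernels on Windows. As I understand it, the nvcc compiler needs cl.exe to compile on Windows, and the main way to get cl.exe is to install Visual Studio, so I installed the free Community edition. After that I expected to find a bin directory inside the VC directory, as several other questions show. Instead, I had to dig several levels deeper to find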

C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\bin\HostX64\x64\cl.exe

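This particular project is meant to produce a program that can be compiled and used on several different Windows systems. Should I really expect cl.exe to be nested this deeply, or did I miss an installation step? I was expecting a shorter path: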

C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\bin\

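Ultimately, I need the simplest possible way for users to make their environment find cl.exe. Usually that means (at the highest level) setting an environment variable.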
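Because the toolset version (14.10.25017 in the path above) is part of the directory name, a hard-coded path will differ from machine to machine. The following is a minimal Python sketch, not an official lookup method, that picks the newest version directory under VC\Tools\MSVC and returns its cl.exe. The function names `find_cl` and `version_key` are hypothetical, and the directory layout is assumed to match the path shown in the question.

```python
from pathlib import Path


def version_key(name):
    """Turn '14.10.25017' into (14, 10, 25017) so versions sort numerically."""
    return tuple(int(part) for part in name.split(".") if part.isdigit())


def find_cl(vs_root, host="HostX64", target="x64"):
    """Return the cl.exe path for the newest MSVC toolset under vs_root, or None.

    Assumes the layout <vs_root>/VC/Tools/MSVC/<version>/bin/<host>/<target>/cl.exe.
    """
    msvc_dir = Path(vs_root) / "VC" / "Tools" / "MSVC"
    if not msvc_dir.is_dir():
        return None
    versions = sorted(
        (d for d in msvc_dir.iterdir() if d.is_dir()),
        key=lambda d: version_key(d.name),
        reverse=True,  # newest toolset first
    )
    for version_dir in versions:
        candidate = version_dir / "bin" / host / target / "cl.exe"
        if candidate.is_file():
            return candidate
    return None
```

A call such as `find_cl(r"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community")` would return the nested path on a machine with that installation, and `None` if the C++ workload is missing.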

Answer 1

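I ran into this problem in different contexts (Elixir/Phoenix, Rust), but the root cause was the same: cl.exe could not be found during compilation.

My setup was:

Windows 10, x64
Visual Studio Community 2017 installed, but only for C# development

For some reason, installing the Visual C++ Build Tools (as @cozzamara suggested) did not work. The installation stopped partway with a vague error message. My guess is that it did not like my existing Visual Studio installation.

This is how I solved it:

Start the Visual Studio Installer
Tick the "Desktop development with C++" workload
Before compiling, run the following command: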
```
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat"
```
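After running this command, cl.exe works in that command prompt. Alternatively (and more conveniently for development), start the "Developer Command Prompt for VS 2017" or the "x64 Native Tools Command Prompt for VS 2017".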
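The batch file only changes the environment of the command prompt it runs in, so a build script started elsewhere does not see those changes. One way around this, sketched below under stated assumptions, is to run the batch file in a child cmd.exe, print the resulting variables with `set`, and parse them. The function names are hypothetical, and the `cmd /s /c` quoting is an assumption about how cmd.exe strips the outer quotes; `load_vcvars_env` only works on Windows.

```python
import subprocess


def parse_set_output(text):
    """Parse the NAME=value lines printed by cmd's `set` into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip lines without '=' or with an empty name
            env[name] = value
    return env


def load_vcvars_env(bat_path):
    """Run a vcvars batch file in cmd.exe and return the environment it sets.

    Windows only. The outer pair of quotes is removed by `cmd /s`, leaving
    the batch path quoted so that spaces in it are preserved.
    """
    command = f'cmd /s /c ""{bat_path}" >nul && set"'
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_set_output(result.stdout)
```

The returned dict can be passed as the `env` argument of a later `subprocess.run` call that invokes the compiler, so the parent process's own environment stays untouched.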

Answer 2

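Look for VCVARSALL.BAT; it usually sits at a higher level in the directory tree. Running it sets up your environment so that you can call CL without specifying a path.

Documentation: https://msdn.microsoft.com/en-us/library/f2ccy3wt.aspx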

Answer 3

I am not sure why, but the Path does not seem to get updated. Try running the command from the Visual Studio 2017 "Developer Command Prompt".

Answer 4

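I had a similar problem: Visual Studio 2017 could not find CL.exe or MIDL.exe for the x64 configuration. The files could be found from the VS command prompt, but not when building from within Visual Studio (although the x86 configuration did work).

When I set the build output verbosity to Diagnostic (Tools => Options => Projects and Solutions => Build and Run => MSBuild project build output verbosity), I noticed that the problem lay with the PATH in the "SetEnv" build step for x64. However, reinstalling Visual Studio, individual components, SDKs and runtimes, cleaning the registry, and so on did not fix it (I was close to reinstalling Windows).

Then I found that Visual Studio C++ projects may import "user.props" files from your application data folder; this is the relevant part of the project file: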

<ImportGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="PropertySheets">
   <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>

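On my PC, $(UserRootDir) evaluates to C:\Users\[username]\AppData\Local\Microsoft\MSBuild\v4.0, where I found the Microsoft.Cpp.xxx.user.props files. These files contained old paths (leftovers from earlier installations and other tools).

So for me, the solution was to delete these props files from the AppData folder.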
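To check whether such files exist on a given machine, a short Python sketch like the one below can list them. It renames the files instead of deleting them, which is a more cautious variant of the fix described above, not what the answer itself did; since the import is guarded by an `exists(...)` condition, a renamed file should no longer be picked up. The function names are hypothetical, and the folder is the one this answer reports, which may differ on other systems.

```python
from pathlib import Path


def find_user_props(user_root_dir):
    """List the Microsoft.Cpp.<Platform>.user.props files in user_root_dir."""
    root = Path(user_root_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("Microsoft.Cpp.*.user.props"))


def set_aside(props_files, suffix=".bak"):
    """Rename each file by appending suffix so the original name no longer exists."""
    renamed = []
    for path in props_files:
        target = path.with_name(path.name + suffix)
        path.rename(target)
        renamed.append(target)
    return renamed
```

On Windows, the directory from this answer could be built as `Path(os.environ["LOCALAPPDATA"]) / "Microsoft" / "MSBuild" / "v4.0"` (with `import os`); reviewing the list returned by `find_user_props` before calling `set_aside` is advisable.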

Answer 5

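I tried Theo's solution of configuring Visual Studio, but it did not work for me. I am running Visual Studio Community 2017 with CUDA Toolkit 10.0 on Windows 10. To be precise, I went to C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build and ran vcvarsamd64_x86.bat. My PyCUDA still failed to compile because cl.exe could not be found.

I ended up creating a test CUDA project in Visual Studio 2017 ("File" -> "New Project") and selecting the appropriate CUDA entry on the left.

Then I built the sample that came up (Ctrl + Shift + B, or "Build" -> "Build Solution"). It is a simple vector addition, reproduced below.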

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#include <stdio.h>

cudaError_t addWithCuda(int *c, const int *a, const int *b, unsigned int size);

__global__ void addKernel(int *c, const int *a, const int *b)
{
    int i = threadIdx.x;
    c[i] = a[i] + b[i];
}

int main()
{
    const int arraySize = 5;
    const int a[arraySize] = { 1, 2, 3, 4, 5 };
    const int b[arraySize] = { 10, 20, 30, 40, 50 };
    int c[arraySize] = { 0 };

    // Add vectors in parallel.
    cudaError_t cudaStatus = addWithCuda(c, a, b, arraySize);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "addWithCuda failed!");
        return 1;
    }

    printf("{1,2,3,4,5} + {10,20,30,40,50} = {%d,%d,%d,%d,%d}\n",
        c[0], c[1], c[2], c[3], c[4]);

    // cudaDeviceReset must be called before exiting in order for profiling and
    // tracing tools such as Nsight and Visual Profiler to show complete traces.
    cudaStatus = cudaDeviceReset();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaDeviceReset failed!");
        return 1;
    }

    return 0;
}

// Helper function for using CUDA to add vectors in parallel.
cudaError_t addWithCuda(int *c, const int *a, const int *b, unsigned int size)
{
    int *dev_a = 0;
    int *dev_b = 0;
    int *dev_c = 0;
    cudaError_t cudaStatus;

    // Choose which GPU to run on, change this on a multi-GPU system.
    cudaStatus = cudaSetDevice(0);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed!  Do you have a CUDA-capable GPU installed?");
        goto Error;
    }

    // Allocate GPU buffers for three vectors (two input, one output).
    cudaStatus = cudaMalloc((void**)&dev_c, size * sizeof(int));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }

    cudaStatus = cudaMalloc((void**)&dev_a, size * sizeof(int));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }

    cudaStatus = cudaMalloc((void**)&dev_b, size * sizeof(int));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }

    // Copy input vectors from host memory to GPU buffers.
    cudaStatus = cudaMemcpy(dev_a, a, size * sizeof(int), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }

    cudaStatus = cudaMemcpy(dev_b, b, size * sizeof(int), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }

    // Launch a kernel on the GPU with one thread for each element.
    addKernel<<<1, size>>>(dev_c, dev_a, dev_b);

    // Check for any errors launching the kernel
    cudaStatus = cudaGetLastError();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "addKernel launch failed: %s\n", cudaGetErrorString(cudaStatus));
        goto Error;
    }

    // cudaDeviceSynchronize waits for the kernel to finish, and returns
    // any errors encountered during the launch.
    cudaStatus = cudaDeviceSynchronize();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize returned error code %d after launching addKernel!\n", cudaStatus);
        goto Error;
    }

    // Copy output vector from GPU buffer to host memory.
    cudaStatus = cudaMemcpy(c, dev_c, size * sizeof(int), cudaMemcpyDeviceToHost);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }

Error:
    cudaFree(dev_c);
    cudaFree(dev_a);
    cudaFree(dev_b);

    return cudaStatus;
}

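After this build succeeded, I looked at the command used to run the build, which contained the following path: C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.15.26726\bin\HostX86\x64. I added this to the PATH environment variable, and now my PyCUDA works! (When I browsed to that path, I found a cl.exe there.)

TL;DR

Create a CUDA project in Visual Studio and build it. Once the build succeeds, look at the build command and copy the path from there into PATH.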
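For a Python workflow such as PyCUDA, the same effect can be approximated per process instead of changing the system-wide PATH. The sketch below is one possible approach under that assumption: it prepends a directory to PATH only if it is not already listed. The helper name `prepend_to_path` is hypothetical.

```python
import os


def prepend_to_path(directory, env=None):
    """Put directory at the front of PATH unless it is already listed.

    Modifies env in place (os.environ by default) and returns the new PATH.
    """
    env = os.environ if env is None else env
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    wanted = os.path.normcase(os.path.normpath(directory))
    already_listed = any(
        os.path.normcase(os.path.normpath(p)) == wanted for p in entries
    )
    if not already_listed:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]
```

Calling `prepend_to_path` with the directory copied from the build command before compiling any kernels, and then checking `shutil.which("cl")`, is a quick way to confirm that the compiler is now visible to the process.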

Answer 6

Download http://landinghub.visualstudio.com/visual-cpp-build-tools and install it. It solved my problem.

Visual Studio Community 2017 cl.exe