Question

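I am writing an "error check" function whose only job is to evaluate a single if statement. The function is overloaded for 7 different error types, but first let me show my code: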

ErrorCheck.h (the entire file):

#pragma once
#ifndef __ERROR_CHECK_H__
#define __ERROR_CHECK_H__
// Header does not #include <***> anything. For obvious reasons.

// The function:
template <typename T>
bool errchk(const T check, const char* file, unsigned int line, const char* from, const char* func);

// How To call it:
#define ERRCHK(_check) \
    errchk(_check, __FILE__, __LINE__, __func__, #_check)

#endif // !__ERROR_CHECK_H__

ErrorCheck.cpp (simplified version):

// Include:
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <cufft.h>
#include <cublas.h>
#include <curand_kernel.h>
#include <cusolver_common.h>
#include "cusparse.h"
#include "ErrorCheck.h"

// The functions below are overloaded 7 times, once for every error type from the headers included above
const char * getErrorName(const Type & error) { /* ... */ };
const char * getErrorString(const Type & error) { /* ... */ };

// The function:
template <typename T, T successValue>
bool errchk(const T check, const char* file, unsigned int line, const char* from, const char* func)
{
    if (check != successValue) {
        // Report Error
        return true; // Error was found.
    }
    return false; // No error.
}

// Instantiations:
template bool errchk <bool            > (const bool             check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <cudaError_t     > (const cudaError_t      check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <cufftResult_t   > (const cufftResult_t    check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <cublasStatus_t  > (const cublasStatus_t   check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <curandStatus_t  > (const curandStatus_t   check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <cusolverStatus_t> (const cusolverStatus_t check, const char * file, unsigned int line, const char * from, const char * func);
template bool errchk <cusparseStatus_t> (const cusparseStatus_t check, const char * file, unsigned int line, const char * from, const char * func);

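Question #1
Is it possible to optimize the if statement inside bool errchk <***> (***)? I know this function is called at run time, but if we think about it, we are just comparing two enum values. So maybe the compiler could be forced to work out every possible outcome of the if statement in advance and then simply run the right one at run time?

Question #2
Is the optimization needed?
Using the <chrono> library, I measured that this code takes up to 40 nanoseconds when a "success" value is detected, and up to 400 milliseconds when an "error" value is detected.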

Answer 1

Is it possible to optimize the if statement inside bool errchk <***> (***)?

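In some cases a compiler can do this, but not in yours. You are writing a function that cannot make any compile-time assumptions about the value of check. However, the compiler might notice that you call, say, errchk(cudaSuccess, whatever, etc, etc); and it may choose to inline the function, in which case it can see that if (check != successValue) is never true and optimize the whole call away.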
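To make the inlining point concrete, here is a minimal sketch, assuming the function body is visible at the call site (for example, defined in the header rather than in the .cpp). It uses a stand-in enum `fake_status_t` instead of a real CUDA type, and the `success_value` trait and `check_runtime` helper are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Stand-in for a CUDA status enum such as cudaError_t (assumption: the real
// CUDA headers are not needed to show the idea).
enum fake_status_t { FAKE_SUCCESS = 0, FAKE_ERROR_INVALID_VALUE = 1 };

// Hypothetical trait that maps a status type to its "success" value.
template <typename T> struct success_value;
template <> struct success_value<fake_status_t> {
    static constexpr fake_status_t value = FAKE_SUCCESS;
};

// Defined where callers can see it, so the compiler is free to inline it.
// constexpr also permits compile-time evaluation for constant arguments.
template <typename T>
constexpr bool errchk(const T check)
{
    return check != success_value<T>::value; // true means "error found"
}

// With a compile-time constant argument, the comparison is resolved by the compiler.
static_assert(!errchk(FAKE_SUCCESS), "success must not be reported as an error");
static_assert(errchk(FAKE_ERROR_INVALID_VALUE), "an error code must be reported");

// With a value only known at run time, the comparison still happens at run time.
inline bool check_runtime(fake_status_t status_from_api)
{
    return errchk(status_from_api);
}

// Usage: the same function serves both cases.
inline void demo()
{
    assert(!check_runtime(FAKE_SUCCESS));
    assert(check_runtime(FAKE_ERROR_INVALID_VALUE));
}
```

In real code the argument is almost always the return value of an API call, so only the run-time path applies, and it amounts to a single integer comparison.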

Is the optimization needed?

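Probably not. If this code sits in a tight, performance-critical loop, you should move it out of the loop; if it is anywhere else, 40 ns is not a big deal. But you need to profile your code to know what you should optimize. Do not waste your time optimizing something that accounts for only a small fraction of the execution time.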
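A rough timing harness is no substitute for a profiler, but it can confirm the order of magnitude of a single check. The sketch below averages over many calls with `std::chrono::steady_clock`; `is_error` and `average_ns_per_call` are hypothetical names, and `is_error` merely stands in for the comparison being measured.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the check being measured.
inline bool is_error(int status) { return status != 0; }

// Returns the average cost of one call, in nanoseconds, over `iterations` calls.
inline double average_ns_per_call(int iterations)
{
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    volatile int status = 0;    // forces a real load on every iteration
    volatile bool sink = false; // keeps the compiler from deleting the loop

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = is_error(status);
    }
    const auto stop = clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Averaging over many iterations matters because a single call is shorter than the resolution and overhead of the clock itself, so one-shot measurements in the tens of nanoseconds are mostly noise.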

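Speaking of profiling, CUDA offers a profiling facility, which can also be used for host-side code. You might also consider using it via my C++-ish wrappers for the CUDA Runtime API (which include wrappers for the profiling-specific API).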

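PS: It seems to me that you probably should not be writing an errchk function at all. You are using C++, remember? Instead of this horrible inbreeding of macros and templates, use exceptions, and you will no longer have to remember to check the return value after every call. Exceptions also let you distinguish error types by class, nest errors within errors, express error information with more than just a number, and so on.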
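As a minimal sketch of the exception-based approach, assuming a stand-in enum `fake_status_t` in place of a real CUDA status type: the class `fake_runtime_error` and the helper `throw_if_error` are hypothetical names for illustration, not part of any CUDA header or of the wrappers mentioned above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for a CUDA status enum (assumption, so the sketch builds without CUDA).
enum fake_status_t { FAKE_SUCCESS = 0, FAKE_ERROR_INVALID_VALUE = 1 };

// Hypothetical exception class: carries the status code plus a readable message.
class fake_runtime_error : public std::runtime_error {
public:
    fake_runtime_error(fake_status_t status, const std::string& what_arg)
        : std::runtime_error(what_arg), status_(status) {}
    fake_status_t status() const noexcept { return status_; }
private:
    fake_status_t status_;
};

// Wrap each API call's result once; callers no longer inspect return values.
inline void throw_if_error(fake_status_t status, const std::string& context)
{
    if (status != FAKE_SUCCESS) {
        throw fake_runtime_error(
            status,
            context + " failed with status " + std::to_string(static_cast<int>(status)));
    }
}

// Usage: errors are handled in one place and can be told apart by class.
inline void demo()
{
    bool caught = false;
    try {
        throw_if_error(FAKE_SUCCESS, "first call");              // does not throw
        throw_if_error(FAKE_ERROR_INVALID_VALUE, "second call"); // throws
    } catch (const fake_runtime_error& e) {
        caught = (e.status() == FAKE_ERROR_INVALID_VALUE);
    }
    assert(caught);
}
```

One exception class per library (cuFFT, cuBLAS, and so on) would let a catch clause select errors by type, which is the class-based distinction mentioned above.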

Is it possible to check a statement containing an enum at compile time
