Question

I am trying to benchmark many (about 25) variations of an algorithm written in C++.

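I implemented these variations using a combination of three methods:

1. Copying code and making minor changes to the copied version
2. Subclassing the base algorithm class
3. Using #ifdefs to switch between snippets of code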

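The variations produced by options 1 and 2 are fine, because I can select which variation of the algorithm to run in a configuration file. I can then iterate through different configuration files and keep a record of "configuration: result" pairs - keeping these records is very important to my work.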

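I am currently having trouble with the #ifdefs, since I have to compile multiple versions of the code to access these variations, which makes it much harder to run automated experiment scripts and to keep accurate records of results. The #ifdefs are very useful, however, because if I find a bug in one copy of the code, I do not have to remember to correct it in multiple copies.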

The #ifdefs expand the six variations I created by copying code and subclassing into 24 variations in total (4 variations of each basic one).

Here is an example - mostly I use the #ifdefs to avoid replicating too much code:

    ....

    double lasso_gam=*gamma;
    *lasso_idx=-1;
    for(int aj=0;aj<(int)a_idx.size();aj++){
        int j=a_idx[aj];
        assert(j<=C*L);
        double inc=wa[aj]*(*gamma)*signs[aj];
        if( (beta_sp(j)>0 && beta_sp(j)+inc<0)
#ifdef ALLOW_NEG_LARS
            || (beta_sp(j)<0 && beta_sp(j)+inc>0)
#else
            || (beta_sp(j)==0 && beta_sp(j)+inc<0)
#endif
            ){
            double tmp_gam=-beta_sp(j)/wa[aj]*signs[aj];

            if(tmp_gam>=0 && tmp_gam<lasso_gam) {
                *lasso_idx=aj;
                *next_active=j;
                lasso_gam=tmp_gam;
            }
        }
    }

    if(*lasso_idx>=0){
        *gamma=lasso_gam;
    }

    ....

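Question: What is the best way to allow the multiple variations of the algorithm currently selected by #ifdefs to be run, given a configuration file that specifies which variation to run?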

Ideally I would like to compile the code only once and select the algorithm variation at runtime using the configuration file.

Answer 1

You can augment your algorithm with a (possibly additional) template argument, like this:

#include <iostream>

enum class algorithm_type
{
    type_a,
    type_b,
    type_c
};

template <algorithm_type AlgorithmType>
void foo(int usual, double args)
{
    std::cout << "common code" << std::endl;

    if (AlgorithmType == algorithm_type::type_a)
    {
        std::cout << "doing type a..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_b)
    {
        std::cout << "doing type b..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_c)
    {
        std::cout << "doing type c..." << usual << ", " << args << std::endl;
    }

    std::cout << "more common code" << std::endl;
}

Now you can select the behaviour through this template argument:

foo<algorithm_type::type_a>(11, 0.1605);
foo<algorithm_type::type_b>(11, 0.1605);
foo<algorithm_type::type_c>(11, 0.1605);

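Because the type is a constant expression, the branch is decided at compile time (that is, the other branches are known to be dead code and are removed). In fact, your compiler should warn you about this (how you deal with that is up to you).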
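If a C++17 compiler is available, the compile-time nature of the branch can be made explicit with `if constexpr`, which discards the untaken branches and usually avoids the constant-condition warning. This is a minimal sketch under that assumption, not part of the original answer; `describe` is a hypothetical stand-in for `foo` that returns a label instead of printing:

```cpp
#include <string>

enum class algorithm_type { type_a, type_b, type_c };

// Hypothetical stand-in for foo: returns a label instead of printing.
template <algorithm_type AlgorithmType>
std::string describe()
{
    // With if constexpr (C++17), only the branch that matches
    // AlgorithmType is instantiated; the others are discarded.
    if constexpr (AlgorithmType == algorithm_type::type_a)
        return "type a";
    else if constexpr (AlgorithmType == algorithm_type::type_b)
        return "type b";
    else
        return "type c";
}
```

With a pre-C++17 compiler the plain `if` shown above still works; it just relies on the optimizer to drop the dead branches.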

But you can still dispatch on a runtime value just fine:

#include <stdexcept>

void foo_with_runtime_switch(algorithm_type algorithmType,
                             int usual, double args)
{
    switch (algorithmType)
    {
    case algorithm_type::type_a:
        return foo<algorithm_type::type_a>(usual, args);
    case algorithm_type::type_b:
        return foo<algorithm_type::type_b>(usual, args);
    case algorithm_type::type_c:
        return foo<algorithm_type::type_c>(usual, args);
    default:
        throw std::runtime_error("wat");
    }
}

foo_with_runtime_switch(algorithm_type::type_a, 11, 0.1605);
foo_with_runtime_switch(algorithm_type::type_b, 11, 0.1605);
foo_with_runtime_switch(algorithm_type::type_c, 11, 0.1605);

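The internals of the algorithm remain the same (dead branches eliminated, same optimizations); only how you get there has changed. (Note that the enum idea can be generalized so that this switch is generated automatically; if you find yourself with a lot of variations, this might be worth learning.)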
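One possible way to generate the dispatch automatically is a table of function pointers built with `std::index_sequence` (C++14). The sketch below is an illustration, not the answer's own code: it assumes the enumerators are contiguous starting at 0, adds a hypothetical `count` sentinel, and uses a simplified `foo` that returns a string.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// "count" is a hypothetical sentinel that must stay last.
enum class algorithm_type { type_a, type_b, type_c, count };

// Simplified stand-in for the templated algorithm.
template <algorithm_type AlgorithmType>
std::string foo(int usual)
{
    return "variant " + std::to_string(static_cast<int>(AlgorithmType)) +
           " got " + std::to_string(usual);
}

using foo_ptr = std::string (*)(int);

// Expands to { &foo<type_a>, &foo<type_b>, &foo<type_c> }.
template <std::size_t... Is>
std::array<foo_ptr, sizeof...(Is)> make_table(std::index_sequence<Is...>)
{
    return {{ &foo<static_cast<algorithm_type>(Is)>... }};
}

// Runtime entry point: looks the instantiation up instead of switching.
std::string foo_dispatch(algorithm_type type, int usual)
{
    constexpr std::size_t n = static_cast<std::size_t>(algorithm_type::count);
    static const std::array<foo_ptr, n> table =
        make_table(std::make_index_sequence<n>{});
    const std::size_t i = static_cast<std::size_t>(type);
    if (i >= n)
        throw std::runtime_error("unknown algorithm_type");
    return table[i](usual);
}
```

Adding a new enumerator before `count` then extends the table without touching the dispatch function.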

And of course you can still #define a specific algorithm as the default:

#define FOO_ALGORITHM algorithm_type::type_a

void foo_with_define(int usual, double args)
{
    return foo<FOO_ALGORITHM>(usual, args);
}

foo_with_define(11, 0.1605);

All of these together give you the advantages of all three, with no repetition.

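In practice, you can have all three as overloads: one for users who know which algorithm to use at compile time, one for those who need to select it at runtime, and one for those who just want a default (which you can override via a project-wide #define):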

// foo.hpp
#include <iostream>
#include <stdexcept>

enum class algorithm_type
{
    type_a,
    type_b,
    type_c
};

// for those who know which algorithm to use
template <algorithm_type AlgorithmType>
void foo(int usual, double args)
{
    std::cout << "common code" << std::endl;

    if (AlgorithmType == algorithm_type::type_a)
    {
        std::cout << "doing type a..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_b)
    {
        std::cout << "doing type b..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_c)
    {
        std::cout << "doing type c..." << usual << ", " << args << std::endl;
    }

    std::cout << "more common code" << std::endl;
}

// for those who will know at runtime
void foo(algorithm_type algorithmType, int usual, double args)
{
    switch (algorithmType)
    {
    case algorithm_type::type_a:
        return foo<algorithm_type::type_a>(usual, args);
    case algorithm_type::type_b:
        return foo<algorithm_type::type_b>(usual, args);
    case algorithm_type::type_c:
        return foo<algorithm_type::type_c>(usual, args);
    default:
        throw std::runtime_error("wat");
    }
}

#ifndef FOO_ALGORITHM
    // chosen to be the best default by profiling
    #define FOO_ALGORITHM algorithm_type::type_b
#endif

// for those who just want a good default
void foo(int usual, double args)
{
    return foo<FOO_ALGORITHM>(usual, args);
}

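Of course, if some implementation type is always worse than another, get rid of it. But if you find there are two useful implementations, there is no harm in keeping both around this way.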
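To connect the runtime overload of `foo` to a configuration file, the value read from the file has to be mapped to an `algorithm_type`. A minimal sketch of that mapping follows; the function name `parse_algorithm_type` and the spellings `"type_a"`, `"type_b"`, `"type_c"` are assumptions for illustration, and reading the file itself is left out.

```cpp
#include <map>
#include <stdexcept>
#include <string>

enum class algorithm_type { type_a, type_b, type_c };

// Hypothetical helper: the spellings are whatever your config files use.
algorithm_type parse_algorithm_type(const std::string& name)
{
    static const std::map<std::string, algorithm_type> names = {
        {"type_a", algorithm_type::type_a},
        {"type_b", algorithm_type::type_b},
        {"type_c", algorithm_type::type_c},
    };
    const auto it = names.find(name);
    if (it == names.end())
        throw std::invalid_argument("unknown algorithm: " + name);
    return it->second;
}
```

A call such as `foo(parse_algorithm_type(value), usual, args)` would then select the variant named in the configuration, and the same string can be stored in the "configuration: result" record.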

Answer 2

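If you have multiple versions controlled by #ifdefs, it is usually best to build multiple executables and have your configuration script decide which executables to run when benchmarking. You then have rules in your Makefile to build the various configurations: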

%-FOO.o: %.cc
        $(CXX) -c $(CFLAGS) -DFOO -o $@ $<

%-BAR.o: %.cc
        $(CXX) -c $(CFLAGS) -DBAR -o $@ $<

test-FOO: $(SRCS:%.cc=%-FOO.o)
        $(CXX) $(LDFLAGS) -DFOO -o $@ $^ $(LDLIBS)

Answer 3

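If your #ifs are scattered all around and change a line of code here or there, turn all of your #ifs into ifs based on an enum, passed into the function, that says which variation to run, and hope that the compiler does a good job of optimizing. Hopefully it will generate almost the same code as defining the function multiple times, except with a single runtime condition to decide which one to run. No promises.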
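Applied to the loop in the question, the condition guarded by `ALLOW_NEG_LARS` could become an ordinary runtime test. This is a sketch only: `crosses_zero` and its `allow_neg_lars` flag are hypothetical names, and a `bool` is used here because there are just two cases (an enum works the same way when there are more).

```cpp
// Hypothetical helper extracted from the question's loop.
// allow_neg_lars replaces the ALLOW_NEG_LARS macro with a runtime flag.
inline bool crosses_zero(double beta, double inc, bool allow_neg_lars)
{
    // Common to both variants: a positive coefficient turning negative.
    if (beta > 0 && beta + inc < 0)
        return true;
    // Former "#ifdef ALLOW_NEG_LARS" branch.
    if (allow_neg_lars)
        return beta < 0 && beta + inc > 0;
    // Former "#else" branch.
    return beta == 0 && beta + inc < 0;
}
```

The loop body would then read `if (crosses_zero(beta_sp(j), inc, allow_neg_lars)) { ... }`, with the flag taken from the configuration file.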

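If you are #if-ing whole blocks of code in the algorithm, split the algorithm up into smaller functions that your different implementations of the whole algorithm can call. This is obviously impractical if your #ifs are so invasive that you would end up with 50 functions.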
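One way to picture this split: the shared skeleton is written once, and each variant supplies only the piece that used to sit inside the #if. The names below (`run_algorithm`, `step_sum`, `step_sum_abs`) are hypothetical and the arithmetic is a placeholder for the real algorithm.

```cpp
#include <vector>

// Hypothetical shared skeleton: the loop exists once, only the step differs.
template <class Step>
double run_algorithm(const std::vector<double>& data, Step step)
{
    double acc = 0.0;            // common setup
    for (double x : data)
        acc = step(acc, x);      // the part that used to be inside #if
    return acc;                  // common result handling
}

// Two small functions that replace the two sides of an #if/#else.
inline double step_sum(double acc, double x) { return acc + x; }
inline double step_sum_abs(double acc, double x) { return acc + (x < 0 ? -x : x); }

// Each variant of the whole algorithm is a thin wrapper.
inline double variant_sum(const std::vector<double>& d)
{
    return run_algorithm(d, step_sum);
}

inline double variant_sum_abs(const std::vector<double>& d)
{
    return run_algorithm(d, step_sum_abs);
}
```

A bug fixed in `run_algorithm` is then fixed for every variant, which keeps the main benefit the #ifdefs provided.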

Answer 4

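If you put the algorithms themselves inside classes with the same interface, you can pass them as template parameters to the place that uses the algorithm.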

#include <iostream>
#include <string>
#include <vector>

// Assumed to be defined elsewhere, e.g. by your configuration-file reader.
std::vector<std::string> get_config_values_from_somewhere();

class foo {
public:
  void do_something() {
    std::cout << "foo!" << std::endl;
  }
};

class bar {
public:
  void do_something() {
    std::cout << "bar!" << std::endl;
  }
};

template <class meh>
void something() {
  meh algorithm;
  algorithm.do_something();
}

int main() {
  std::vector<std::string> config_values = get_config_values_from_somewhere();
  for (const auto& config : config_values) { // c++11 for short notation
    // C++ cannot switch on strings, so compare the names explicitly
    if (config == "foo") {
      something<foo>();
    } else if (config == "bar") {
      something<bar>();
    } else {
      std::cout << "undefined behaviour" << std::endl;
    }
  }
}

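This way you can use the different behaviours side by side and distinguish them by name. Also, if you do not use one of them, it will be removed by the optimizer at compile time (which is not the case in your problem, though).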

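When reading the configuration file, you just need a factory (or something similar) that creates the correct instance of the object or function that will use the algorithm, before the algorithm is run.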
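Such a factory might look like the following sketch, which maps a name from the configuration file to the matching template instantiation. `make_algorithm` and the registry are hypothetical, and `do_something` returns a string here (rather than printing) only so that the result is easy to check.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

struct foo { std::string do_something() { return "foo!"; } };
struct bar { std::string do_something() { return "bar!"; } };

template <class Algorithm>
std::string something()
{
    Algorithm algorithm;
    return algorithm.do_something();
}

// Hypothetical factory: returns the callable registered under "name".
std::function<std::string()> make_algorithm(const std::string& name)
{
    static const std::map<std::string, std::function<std::string()>> registry = {
        {"foo", &something<foo>},
        {"bar", &something<bar>},
    };
    const auto it = registry.find(name);
    if (it == registry.end())
        throw std::invalid_argument("unknown algorithm: " + name);
    return it->second;
}
```

Calling `make_algorithm(config)()` inside the loop over configuration values would replace the chain of name comparisons, and registering a new variant is a one-line change.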

Edit: added basic dispatch on the config value.

Answer 5

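You have not mentioned which compiler you are using, but you can set #defines on the command line for any of them. In gcc, all you need is to add -D MYTESTFOO to define MYTESTFOO. That would make #defines the way to go - no code changes to propagate, and sure, you will have different compiled code for each test, but that should be easy to automate.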
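Since each test then runs a differently compiled binary, keeping "configuration: result" records is easier if every binary can report how it was built. A small sketch of that idea, using the `ALLOW_NEG_LARS` macro from the question; `variant_name` is a hypothetical helper, not something the answer prescribes:

```cpp
#include <string>

// Hypothetical helper: reports which variant this binary was compiled as,
// so the experiment script can log it next to the result.
inline std::string variant_name()
{
#ifdef ALLOW_NEG_LARS
    return "ALLOW_NEG_LARS";
#else
    return "default";
#endif
}
```

Printing `variant_name()` at the start of each run ties every result to the -D flags that produced it.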

Answer 6

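One way would be to not leave the choice between preprocessor variants to the build, and instead compile every variant into the same executable, like this: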

#define METHOD METHOD1
int Method1() { return whatever(); };
#undef METHOD

#define METHOD METHOD2
int Method2() { return whatever(); };
#undef METHOD

Assuming whatever depends on METHOD, these will give different results.
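For the result to depend on METHOD, `whatever` has to be expanded by the preprocessor at the point of use (a macro, or code pulled in with #include); an ordinary function would be compiled once and ignore later redefinitions. A minimal sketch under that assumption, with a hypothetical `WHATEVER` macro and numeric values for METHOD chosen for illustration:

```cpp
// Hypothetical: WHATEVER is a macro, so METHOD is looked up where it is used.
#define WHATEVER() (METHOD * 10)

#define METHOD 1
inline int Method1() { return WHATEVER(); }  // expands to (1 * 10)
#undef METHOD

#define METHOD 2
inline int Method2() { return WHATEVER(); }  // expands to (2 * 10)
#undef METHOD
```

Both functions live in the same executable, so a configuration file can choose between `Method1` and `Method2` at runtime.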

Handling #ifdefs used to create multiple versions of an algorithm
