Question

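I have a method where performance is really important. (I know premature optimization is the root of all evil. I know I should profile my code, and I did. In this application, every tenth of a second I save is a big win.) This method uses different heuristics to generate and return elements. The heuristics are used sequentially: the first heuristic is used until it can no longer return elements, then the second heuristic is used until it can no longer return elements, and so on until all heuristics have been used. On each call of the method I use a switch to jump to the right heuristic. This is ugly, but it works well. Here is some pseudocode: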

class MyClass
{
private:
   unsigned int m_step;
public:
   MyClass() : m_step(0) {};

   Elem GetElem()
   {
      // The compiler will typically turn this switch statement into a jump table.
      // Note that there are no break statements between the cases: the fall-through is intentional.
      switch (m_step)
      {
      case 0:
         if (UseHeuristic1())
         {
            m_step = 1; // Heuristic one is special: it never provides more than one element.
            return theElem;
         }

         m_step = 1;

      case 1:
         DoSomeOneTimeInitialisationForHeuristic2();
         m_step = 2;

      case 2:
         if (UseHeuristic2())
         {
            return theElem;
         }

         m_step = 3;

      case 3:
         if (UseHeuristic3())
         {
            return theElem;
         }
         m_step = 4; // But the method should not be called again
      }

      return someErrorCode;
   }
};

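As I said, this works, and it is efficient because on each call the execution jumps right where it should. If a heuristic cannot provide an element, m_step is incremented (so we do not try this heuristic again next time) and, because there is no break statement, the next heuristic is tried. Also note that some steps (like step 1) never return an element; they are one-time initializations for the next heuristic.

The reason the initializations are not all done up front is that they might never be needed. It is always possible (and common) that GetElem is never called again after it has returned an element, even if there are still elements it could return.

While this is an efficient implementation, I find it really ugly. The case statement is a hack; using it without break is also hackish; and the method gets really long, even if each heuristic is encapsulated in its own method.

How should I refactor this code so that it is more readable and elegant while staying as efficient as possible?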

Answer 1

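Wrap each heuristic in an iterator. Initialize it completely on the first call to hasNext(). Then collect all the iterators in a list and use a super-iterator to iterate through all of them: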

boolean hasNext () {
    // Drop exhausted iterators from the front until one has an element left.
    // Checking isEmpty() before every get(0) avoids reading from an empty list.
    while (!list.isEmpty()) {
        if (list.get(0).hasNext()) return true;
        list.remove (0);
    }
    return false;
}
Object next () {
    return list.get (0).next ();
}

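Note: in this case, a linked list might be a tiny bit faster than an ArrayList, but you should still measure it.

[EDIT] Changed "turn each" to "wrap each" to make my intention clearer.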
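
Since the question is about C++, the same idea can be sketched there as well. This is only an illustration under assumptions: the names HeuristicSource and ChainedSource, the int element type, and the use of std::optional (C++17) to signal "no more elements" are all hypothetical and not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical element type; the question does not say what Elem is.
using Elem = int;

// One heuristic wrapped as a lazy source. `init` (optional) runs once, right
// before the first element is requested; `next` returns std::nullopt once the
// heuristic is exhausted.
struct HeuristicSource {
    std::function<void()> init;
    std::function<std::optional<Elem>()> next;
    bool initialised = false;
};

// The "super-iterator": drains each source in order and never revisits
// a source that has reported exhaustion.
class ChainedSource {
public:
    explicit ChainedSource(std::vector<HeuristicSource> sources)
        : m_sources(std::move(sources)) {}

    std::optional<Elem> Next() {
        while (m_current < m_sources.size()) {
            HeuristicSource& s = m_sources[m_current];
            assert(s.next && "every source needs a next function");
            if (!s.initialised) {
                if (s.init) s.init();   // one-time, lazy initialisation
                s.initialised = true;
            }
            if (auto e = s.next()) return e;
            ++m_current;                // exhausted: move on to the next source
        }
        return std::nullopt;
    }

private:
    std::vector<HeuristicSource> m_sources;
    std::size_t m_current = 0;
};
```

The std::function indirection costs something on every call, so in a hot path this version should be profiled against the original switch before it is adopted.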

Answer 2

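I don't think your code is so bad, but if you do this kind of thing a lot and you want to hide the mechanism so that the logic is clearer, you could look at Simon Tatham's coroutine macros. They are intended for C (using static variables) rather than C++ (using member variables), but it is trivial to change that.

The result should look something like this: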

Elem GetElem()
{
  crBegin;

  if (UseHeuristic1())
  {
     crReturn(theElem);
  }

  DoSomeOneTimeInitialisationForHeuristic2();

  while (UseHeuristic2())
  {
     crReturn(theElem);
  }

  while (UseHeuristic3())
  {
     crReturn(theElem);
  }

  crFinish;
  return someErrorCode;
}
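
For readers who have not seen the macros, here is a minimal sketch of how they could be adapted to keep the resume point in a member variable instead of a function-local static. The member name m_line, the Generator class, and the stub heuristics are assumptions made for illustration; only the macro technique itself (a switch on the stored line number) comes from the coroutine macros the answer refers to.

```cpp
#include <cassert>

// Member-variable adaptation: the resume point lives in m_line (hypothetical
// name) rather than in a static local, so each object has its own state.
#define crBegin      switch (m_line) { case 0:
#define crReturn(x)  do { m_line = __LINE__; return (x); case __LINE__:; } while (0)
#define crFinish     }

class Generator {
public:
    // Returns the next element, or -1 (a stand-in for someErrorCode).
    int GetElem() {
        crBegin;
        if (UseHeuristic1()) {
            crReturn(m_elem);
        }
        DoSomeOneTimeInitialisationForHeuristic2();
        while (UseHeuristic2()) {
            crReturn(m_elem);
        }
        crFinish;
        return -1;
    }

private:
    // Stub heuristics, purely for demonstration.
    bool UseHeuristic1() { m_elem = 1; return true; }
    void DoSomeOneTimeInitialisationForHeuristic2() { m_remaining = 2; }
    bool UseHeuristic2() {
        if (m_remaining == 0) return false;
        m_elem = 10 * m_remaining--;
        return true;
    }

    int m_line = 0;       // 0 means "start from the top"
    int m_elem = 0;
    int m_remaining = 0;
};

// Usage sketch: each call resumes just after the previous crReturn.
inline void Demo() {
    Generator g;
    assert(g.GetElem() == 1);   // heuristic 1 yields once
    assert(g.GetElem() == 20);  // initialisation runs, heuristic 2 starts
    assert(g.GetElem() == 10);
    assert(g.GetElem() == -1);  // everything exhausted
}
```

Two caveats of this sketch: each crReturn must sit on its own source line, because the line number is the case label, and no local variable with an initializer may be declared between crBegin and crFinish, because the switch would jump past it.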

Answer 3

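It looks like there really isn't much to optimize in this code; most of the optimization can probably be done in the UseHeuristic functions. What is in them?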

Answer 4

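In my opinion, if you do not need to modify this code much, for example to add new heuristics, then document it well and don't touch it.

However, if heuristics are added and removed and you think that this is an error-prone process, then you should consider refactoring it. The obvious choice would be to introduce the State design pattern. This replaces your switch statement with polymorphism, which might slow things down, but you would have to profile both versions to be sure.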
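
A minimal sketch of what the State pattern could look like here, assuming C++17 and hypothetical class names (State, Heuristic1State, Heuristic2State, Generator) with stubbed heuristics. The one-time initialisation moves into the constructor of the state that needs it, so it still runs only if that state is ever reached.

```cpp
#include <cassert>
#include <memory>
#include <optional>

using Elem = int;  // hypothetical element type

// One state per heuristic. Produce() returns an element, or std::nullopt when
// the heuristic is exhausted; Next() creates the following state.
class State {
public:
    virtual ~State() = default;
    virtual std::optional<Elem> Produce() = 0;
    virtual std::unique_ptr<State> Next() = 0;  // nullptr: no more states
};

class Heuristic2State : public State {
public:
    Heuristic2State() : m_remaining(2) {}  // one-time initialisation, done lazily
    std::optional<Elem> Produce() override {
        if (m_remaining == 0) return std::nullopt;
        return 20 + m_remaining--;
    }
    std::unique_ptr<State> Next() override { return nullptr; }
private:
    int m_remaining;
};

class Heuristic1State : public State {
public:
    std::optional<Elem> Produce() override {
        if (m_used) return std::nullopt;  // never more than one element
        m_used = true;
        return 10;
    }
    std::unique_ptr<State> Next() override {
        return std::make_unique<Heuristic2State>();
    }
private:
    bool m_used = false;
};

class Generator {
public:
    Generator() : m_state(std::make_unique<Heuristic1State>()) {}

    std::optional<Elem> GetElem() {
        while (m_state) {
            if (auto e = m_state->Produce()) return e;
            m_state = m_state->Next();  // current state is exhausted
        }
        return std::nullopt;
    }
private:
    std::unique_ptr<State> m_state;
};

// Usage sketch.
inline void Demo() {
    Generator g;
    assert(g.GetElem() == 10);
    assert(g.GetElem() == 22);
    assert(g.GetElem() == 21);
    assert(!g.GetElem());
}
```

Compared with the switch, this version pays for a virtual call per element and a heap allocation per state change, which is exactly the cost the answer suggests profiling.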

Answer 5

You can turn the control flow inside out.

template <class Callback>  // a callback that returns true when it's done
void Walk(Callback fn)
{
    if (UseHeuristic1()) {
        if (fn(theElem))
            return;
    }
    DoSomeOneTimeInitialisationForHeuristic2();
    while (UseHeuristic2()) {
        if (fn(theElem))
            return;
    }
    while (UseHeuristic3()) {
        if (fn(theElem))
            return;
    }
}

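This might earn you a few nanoseconds if the switch dispatch and the return statements are throwing the CPU off its stride, and if the callback can be inlined.

Of course, this kind of optimization is futile if the heuristics themselves are non-trivial. And much depends on what the caller looks like.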
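
To make the caller side concrete, here is a sketch of how such a Walk could be driven with a lambda. The Source struct, its stub heuristics, and the TakeFirst helper are hypothetical; Walk is written as a member template here only so that the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stub heuristics standing in for the question's UseHeuristicN() functions.
struct Source {
    int theElem = 0;
    int remaining2 = 3;

    bool UseHeuristic1() { theElem = 1; return true; }
    void DoSomeOneTimeInitialisationForHeuristic2() {}
    bool UseHeuristic2() {
        if (remaining2 == 0) return false;
        theElem = 20 + remaining2--;
        return true;
    }

    template <class Callback>  // the callback returns true when the caller is done
    void Walk(Callback fn) {
        if (UseHeuristic1()) {
            if (fn(theElem)) return;
        }
        DoSomeOneTimeInitialisationForHeuristic2();
        while (UseHeuristic2()) {
            if (fn(theElem)) return;
        }
    }
};

// Caller: collect elements until `wanted` of them have been seen, then stop
// the walk by returning true from the callback.
inline std::vector<int> TakeFirst(Source& src, std::size_t wanted) {
    assert(wanted > 0);
    std::vector<int> out;
    src.Walk([&](int e) {
        out.push_back(e);
        return out.size() >= wanted;
    });
    return out;
}
```

The trade-off is that the caller can no longer pull one element, do unrelated work, and come back later for the next one: all consumption has to happen inside the callback.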

Answer 6

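This is a micro-optimization, but there is no need to set the m_step value when you are not returning from GetElem. See the code below.

A bigger optimization would definitely require simplifying the control flow (fewer jumps, fewer returns, fewer tests, fewer function calls), because a mispredicted jump can flush the processor pipeline (most processors have branch prediction, but it is no silver bullet). You can try the solutions proposed by Aaron or Jason, and there are others (for example, you could implement several get_elem functions and call them through a function pointer, but I am fairly sure that would be slower).

If the problem allows it, it can also be efficient to compute several elements at once inside the heuristics and keep them in a cache, or to make it truly parallel, with a thread computing elements while the client just waits for results. Without more details about the context, there is no way to say more.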

class MyClass
{
private:
   unsigned int m_step;
public:
   MyClass() : m_step(0) {};

   Elem GetElem()
   {
      // The compiler will typically turn this switch statement into a jump table.
      // Note that there are no break statements between the cases: the fall-through is intentional.
      switch (m_step)
      {
      case 0:
         if (UseHeuristic1())
         {
            m_step = 1; // Heuristic one is special: it never provides more than one element.
            return theElem;
         }

      case 1:
         DoSomeOneTimeInitialisationForHeuristic2();
         m_step = 2;

      case 2:
         if (UseHeuristic2())
         {
            return theElem;
         }

      case 3:
         m_step = 4;

      case 4:
         if (UseHeuristic3())
         {
            return theElem;
         }
         m_step = 5; // But the method should not be called again
      }

      return someErrorCode;
   }
};

Answer 7

What you really can do is replace the conditional with the State pattern.

http://en.wikipedia.org/wiki/State_pattern

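It might perform worse because of the virtual method calls, or it might perform better because there is less state-maintenance code, but the code will definitely be much clearer and more maintainable, as is usual with patterns.

What could improve performance is taking DoSomeOneTimeInitialisationForHeuristic2(); out of the main path by giving it a separate state between states 1 and 2.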

Answer 8

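Since each heuristic is represented by a function with an identical signature, you can create a table of function pointers and walk through it.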

class MyClass 
{ 
private: 
   typedef bool heuristic_function();
   typedef heuristic_function * heuristic_function_ptr;
   static heuristic_function_ptr heuristic_table[4];
   unsigned int m_step; 
public: 
   MyClass() : m_step(0) {}; 

   Elem GetElem() 
   { 
      while (m_step < sizeof(heuristic_table)/sizeof(heuristic_table[0]))
      {
         if (heuristic_table[m_step]())
         {
            return theElem;
         }
         ++m_step;
      }

      return someErrorCode; 
   }; 
}; 

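// For this table to work, every entry must return bool: DoSomeOneTimeInitialisationForHeuristic2
// has to return false so that the loop moves straight on to UseHeuristic2.
MyClass::heuristic_function_ptr MyClass::heuristic_table[4] = { UseHeuristic1, DoSomeOneTimeInitialisationForHeuristic2, UseHeuristic2, UseHeuristic3 };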
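
A self-contained variant of the table idea, sketched with pointers to member functions so that the heuristics can reach the object's state. The stub heuristics, the -1 error code, and the names Step, kSteps and kTable are assumptions made for illustration. Note that, unlike in the original switch, nothing here stops heuristic 1 from being asked twice, so the stub has to remember that it has already produced its single element.

```cpp
#include <cassert>
#include <cstddef>

class MyClass {
public:
    // Returns the next element, or -1 (a stand-in for someErrorCode).
    int GetElem() {
        while (m_step < kSteps) {
            if ((this->*kTable[m_step])()) return m_elem;
            ++m_step;  // this step is exhausted; never try it again
        }
        return -1;
    }

private:
    typedef bool (MyClass::*Step)();
    static constexpr std::size_t kSteps = 3;
    static const Step kTable[kSteps];

    // Stub heuristics, purely for demonstration.
    bool UseHeuristic1() {
        if (m_used1) return false;  // must not yield a second time
        m_used1 = true;
        m_elem = 1;
        return true;
    }
    bool InitHeuristic2() {         // initialisation step: never yields
        m_remaining = 2;
        return false;
    }
    bool UseHeuristic2() {
        if (m_remaining == 0) return false;
        m_elem = 20 + m_remaining--;
        return true;
    }

    std::size_t m_step = 0;
    int m_elem = 0;
    int m_remaining = 0;
    bool m_used1 = false;
};

const MyClass::Step MyClass::kTable[MyClass::kSteps] = {
    &MyClass::UseHeuristic1, &MyClass::InitHeuristic2, &MyClass::UseHeuristic2
};

// Usage sketch.
inline void Demo() {
    MyClass c;
    assert(c.GetElem() == 1);
    assert(c.GetElem() == 22);
    assert(c.GetElem() == 21);
    assert(c.GetElem() == -1);
}
```

Each call goes through an indirect call that the compiler usually cannot inline, so this is worth measuring against the switch version.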

Answer 9

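If the element code that you are processing can be converted to an integral value, then you can construct a table of function pointers and index it by element. The table would have one entry for each "handled" element and one for each known but unhandled element. For unknown elements, do a quick range check before indexing the function pointer table.

Calling the element-processing function is then fast.

Here is working sample code: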

#include <cstdlib>
#include <iostream>
using namespace std;

typedef void (*ElementHandlerFn)(void);

void ProcessElement0()
{
    cout << "Element 0" << endl;
}

void ProcessElement1()
{
    cout << "Element 1" << endl;
}
void ProcessElement2()
{
    cout << "Element 2" << endl;
}

void ProcessElement3()
{
    cout << "Element 3" << endl;
}

void ProcessElement7()
{
    cout << "Element 7" << endl;
}

void ProcessUnhandledElement()
{
    cout << "> Unhandled Element <" << endl;
}




int main()
{
    // construct a table of function pointers, one for each possible element (even unhandled elements)
    // note: i am assuming that there are 10 possible elements -- 0, 1, 2 ... 9 --
    // and that 5 of them (0, 1, 2, 3, 7) are 'handled'.

    static const size_t MaxElement = 9;
    ElementHandlerFn handlers[] = 
    {
        ProcessElement0,
        ProcessElement1,
        ProcessElement2,
        ProcessElement3,
        ProcessUnhandledElement,
        ProcessUnhandledElement,
        ProcessUnhandledElement,
        ProcessElement7,
        ProcessUnhandledElement,
        ProcessUnhandledElement
    };

    // mock up some elements to simulate input, including 'invalid' elements like 12
    int testElements [] = {0, 1, 2, 3, 7, 4, 9, 12, 3, 3, 2, 7, 8 };
    size_t numTestElements = sizeof(testElements)/sizeof(testElements[0]);

    // process each test element
    for( size_t ix = 0; ix < numTestElements; ++ix )
    {
        // for some robustness...
        if( testElements[ix] < 0 || static_cast<size_t>(testElements[ix]) > MaxElement )
            cout << "Invalid Input!" << endl;
        // otherwise process normally
        else
            handlers[testElements[ix]]();

    }

    return 0;
}

Answer 10

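If it ain't broke, don't fix it.

It looks pretty efficient as it is. It doesn't look hard to understand either. Adding iterators and the like would probably make it harder to understand.

You are probably better off doing:

- Performance analysis. Is time really being spent in this procedure at all, or is most of it spent in the functions it calls? I can't see any significant time being spent here.
- More unit tests, to prevent someone from breaking it if they have to modify it.
- Additional comments in the code.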
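
As a starting point for the first item, a crude timing harness can be sketched with std::chrono. The ElapsedMs name and its signature are hypothetical, and a real profiler will give a far better picture of where the time goes than wall-clock timing of a loop.

```cpp
#include <cassert>
#include <chrono>

// Calls `fn` `iterations` times and returns the elapsed wall-clock time in
// milliseconds. Only a rough measurement: the compiler may optimise away work
// whose result is never used, so make sure `fn` has an observable effect.
template <class Fn>
double ElapsedMs(Fn fn, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

Timing GetElem as a whole and then each UseHeuristic function on its own, with the same harness, shows whether the dispatch code is a measurable share of the total at all.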

How can I refactor this code with performance in mind?
