Question

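I went through the answers to similar topics here on SO but couldn't find a satisfying one. Since I know this is a rather large topic, I will try to be more specific.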

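I want to write a program that processes files. The processing is nontrivial, so the best approach is to split the different phases into standalone modules that are then used as needed (sometimes I am only interested in the output of module A, sometimes I need the output of five other modules, and so on). The catch is that the modules have to cooperate, because the output of one may be the input of another. And I need it to be fast. Moreover, I want to avoid doing certain processing more than once: if module A creates data that modules B and C both need, I don't want to run module A twice to create their input.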

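The information the modules need to share is mostly blocks of binary data and/or offsets into the processed files. The main program's task is quite simple: parse the arguments and run the required modules (and perhaps produce some output, or should that be the modules' job?).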

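I don't need the modules to be loaded at runtime. It is perfectly fine to have libraries with a .h file and to recompile the program every time a module is added or updated. The idea of modules is here mainly for code readability, for maintenance, and so that more people can work on different modules without needing some predefined interface or the like (on the other hand, some "guidelines" on how to write the modules would probably be required, I know). We can assume that the file processing is read-only; the original file is not changed.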

Could someone point me in a good direction on how to do this in C++? Any advice is welcome (links, tutorials, PDF books...).

Answer 1

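I wonder whether C++ is the right level to think at for this purpose. In my experience, it has always proven useful to have separate programs that are piped together, in the UNIX philosophy.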

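If your data is not too large, splitting has many advantages. First, you can test every phase of the processing independently: run one program, redirect its output to a file, and check the result easily. Then, you take advantage of multi-core systems even if each program is single-threaded, and thus much easier to write and debug. You also get the operating system's synchronization for free through the pipes between your programs. Perhaps some of your programs could even be replaced by existing utilities?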

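Your final program then provides the glue that gathers all your utilities into a single program, piping data from one program to the next (no more files at this point) and replicating it as required for all your computations.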
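As a rough illustration of the piped approach, each stage can be written as a function over generic streams, so the same code can serve as a standalone filter (wired to std::cin and std::cout) or be chained in-process. The stage names `number_lines` and `keep_matching` and the helper `run_pipeline` are hypothetical and not part of the answer; this is only a sketch of the idea.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stage A: prefix every input line with its 1-based line number.
void number_lines(std::istream& in, std::ostream& out)
{
    std::string line;
    for (int n = 1; std::getline(in, line); ++n)
        out << n << ":" << line << "\n";
}

// Hypothetical stage B: keep only the lines that contain `needle`.
void keep_matching(std::istream& in, std::ostream& out, const std::string& needle)
{
    std::string line;
    while (std::getline(in, line))
        if (line.find(needle) != std::string::npos)
            out << line << "\n";
}

// In-process equivalent of `stage_a < input | stage_b needle`.
std::string run_pipeline(const std::string& input, const std::string& needle)
{
    std::istringstream source(input);
    std::stringstream between;   // plays the role of the pipe
    std::ostringstream sink;
    number_lines(source, between);
    keep_matching(between, sink, needle);
    return sink.str();
}

// Usage sketch: only the numbered lines containing "bar" survive.
void pipeline_example()
{
    assert(run_pipeline("foo\nbar\nfoo bar\n", "bar") == "2:bar\n3:foo bar\n");
}
```

Wrapping each stage in a tiny main that passes std::cin and std::cout would turn it into a separate program that the shell can connect with pipes.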

Answer 2

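This looks very similar to a plugin architecture. I recommend starting with an (informal) data flow chart to identify:

how these blocks process data
what data needs to be transferred
what results come back from one block to another (data / error codes / exceptions)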

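With this information you can start to build generic interfaces that allow binding to other interfaces at runtime. Then I would add a factory function to each module for requesting the real processing object from it. I don't recommend getting the processing objects directly out of the module interface; instead, return a factory object from which the processing objects can be retrieved. These processing objects are then used to build the entire processing chain.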

An oversimplified outline would look like this:

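#include <string>

struct Data;              // placeholder: payload passed between blocks
enum class WhichDoIWant;  // placeholder: selects the kind of processor

struct Processor
{
    void doSomething(Data);
};

struct Module
{
    std::string name();
    Processor* getProcessor(WhichDoIWant);
    void deleteProcessor(Processor*);
};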

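Off the top of my head, these patterns are likely to appear:

Factory function: to get objects from modules
Composite and Decorator: to form the processing chain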
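To make the last pattern pair more concrete, here is a minimal sketch of a processing chain built from a composite and a decorator. It assumes a `Processor` interface that returns its output (unlike the `doSomething` outline above), uses std::string as a stand-in for `Data`, and introduces the hypothetical types `AppendSuffix`, `Chain` and `Counting` purely for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payload type; the answer leaves `Data` unspecified.
using Data = std::string;

struct Processor
{
    virtual ~Processor() = default;
    virtual Data process(const Data& input) = 0;
};

// Leaf processor: appends a fixed suffix.
struct AppendSuffix : Processor
{
    explicit AppendSuffix(std::string s) : suffix(std::move(s)) {}
    Data process(const Data& input) override { return input + suffix; }
    std::string suffix;
};

// Composite: runs its children in order, feeding each output to the next.
struct Chain : Processor
{
    void add(std::unique_ptr<Processor> p) { stages.push_back(std::move(p)); }
    Data process(const Data& input) override
    {
        Data current = input;
        for (auto& stage : stages) current = stage->process(current);
        return current;
    }
    std::vector<std::unique_ptr<Processor>> stages;
};

// Decorator: wraps another processor and counts how often it is invoked.
struct Counting : Processor
{
    explicit Counting(std::unique_ptr<Processor> p) : inner(std::move(p)) {}
    Data process(const Data& input) override { ++calls; return inner->process(input); }
    std::unique_ptr<Processor> inner;
    int calls = 0;
};

// Usage sketch (std::make_unique needs C++14).
void chain_example()
{
    Chain chain;
    chain.add(std::make_unique<AppendSuffix>("-a"));
    auto counted = std::make_unique<Counting>(std::make_unique<AppendSuffix>("-b"));
    Counting* counter = counted.get();   // keep a view of the decorator after moving it
    chain.add(std::move(counted));
    assert(chain.process("x") == "x-a-b");
    assert(counter->calls == 1);
}
```

Because `Chain` is itself a `Processor`, chains can be nested, and a decorator can wrap either a single stage or a whole chain.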

Answer 3

This really seems quite trivial, so I suppose we are missing some requirements.

Use memoization to avoid computing a result more than once. This should be done in the framework.

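You could use a flowchart to determine how information passes from one module to another... but the simplest way is to have each module directly call the modules it depends on. With memoization this does not cost much: if a result has already been computed, it is simply reused.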
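A small sketch of that idea, under the assumption that results are plain integers and that steps are registered as callables: the hypothetical `Memo` class below caches each named step, and steps request their dependencies through the same cache, so a shared dependency runs only once.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical framework-level cache: each named step runs at most once.
class Memo
{
public:
    using Step = std::function<int(Memo&)>;

    void define(const std::string& id, Step step) { mSteps[id] = step; }

    int get(const std::string& id)
    {
        auto cached = mResults.find(id);
        if (cached != mResults.end()) return cached->second; // already computed
        int value = mSteps.at(id)(*this); // a step may call get() on its dependencies
        mResults[id] = value;
        ++mExecutions;
        return value;
    }

    int executions() const { return mExecutions; }

private:
    std::map<std::string, Step> mSteps;
    std::map<std::string, int> mResults;
    int mExecutions = 0;
};

// Usage sketch: B and C both depend on A, yet A runs only once.
int memo_example()
{
    Memo memo;
    memo.define("A", [](Memo&) { return 10; });
    memo.define("B", [](Memo& m) { return m.get("A") + 1; });
    memo.define("C", [](Memo& m) { return m.get("A") * 2; });
    int sum = memo.get("B") + memo.get("C"); // 11 + 20
    assert(memo.executions() == 3);          // A, B and C ran once each
    return sum;
}
```

The sketch does no cycle detection; two steps that depend on each other would recurse forever, so a real framework would need to guard against that.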

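Since you need to be able to launch any module, you need to give the modules IDs and register them somewhere so that they can be looked up at runtime. There are two ways to do this.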

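Exemplar: you get the unique exemplar of this kind of module and execute it.
Factory: you create a module of the requested kind, execute it, and throw it away.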

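The downside of the Exemplar method is that if you execute a module twice, you do not start from a clean state but from the state left behind by the last (possibly failed) execution. For memoization this might be seen as an advantage, but if the execution failed the result is not computed (urgh), so I would recommend against it.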
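The difference between the two registration styles can be sketched as follows. The `Counter` module and both registry types are hypothetical and exist only to show how state does or does not leak between runs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical module that accumulates state while it runs.
struct Counter
{
    int runs = 0;
    int execute() { return ++runs; }
};

// Exemplar style: one shared instance per ID, so state survives between runs.
struct ExemplarRegistry
{
    std::map<std::string, Counter> exemplars;
    int run(const std::string& id) { return exemplars.at(id).execute(); }
};

// Factory style: a fresh instance per run, discarded afterwards.
struct FactoryRegistry
{
    std::map<std::string, std::function<std::unique_ptr<Counter>()>> factories;
    int run(const std::string& id)
    {
        std::unique_ptr<Counter> module = factories.at(id)();
        return module->execute();
    }
};

// Usage sketch comparing the two styles.
void registry_example()
{
    ExemplarRegistry shared;
    shared.exemplars["count"] = Counter();
    assert(shared.run("count") == 1);
    assert(shared.run("count") == 2);   // state left over from the first run

    FactoryRegistry fresh;
    fresh.factories["count"] = [] { return std::unique_ptr<Counter>(new Counter()); };
    assert(fresh.run("count") == 1);
    assert(fresh.run("count") == 1);    // clean state every time
}
```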

So how do you...?

Let's begin with the factory.

#include <map>
#include <memory>
#include <string>

class Module;
class Result;

class Organizer
{
public:
  void AddModule(std::string id, const Module& module);
  void RemoveModule(const std::string& id);

  const Result* GetResult(const std::string& id) const;

private:
  typedef std::map< std::string, std::shared_ptr<const Module> > ModulesType;
  typedef std::map< std::string, std::shared_ptr<const Result> > ResultsType;

  ModulesType mModules;
  mutable ResultsType mResults; // Memoization
};

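This is a very basic interface. However, since we want a new instance of the module each time we invoke the Organizer (to avoid reentrancy problems), we need to work on our Module interface.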

class Module
{
public:
  typedef std::unique_ptr<const Result> ResultPointer; // std::auto_ptr was removed in C++17

  virtual ~Module() {}               // it's a base class
  virtual Module* Clone() const = 0; // traditional cloning concept

  virtual ResultPointer Execute(const Organizer& organizer) = 0;
}; // class Module
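The answer only forward-declares `Result`, while the main function further down calls `result->print()`. A minimal base class along those lines might look like the sketch below; the `describe` hook and the `IntResult` subclass are assumptions made for illustration, not part of the original answer.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Assumed base class: the answer only forward-declares Result.
class Result
{
public:
    virtual ~Result() {}                       // results are deleted through a base pointer
    virtual std::string describe() const = 0;  // hypothetical hook for subclasses
    void print() const { std::cout << describe() << "\n"; }
};

// Hypothetical concrete result carrying a single integer.
struct IntResult : Result
{
    explicit IntResult(int v) : value(v) {}
    std::string describe() const override { return "value = " + std::to_string(value); }
    int value;
};

// Usage sketch: work through the base interface only.
void result_example()
{
    IntResult r(42);
    const Result& base = r;
    assert(base.describe() == "value = 42");
    base.print();   // writes the description to standard output
}
```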

And now it's easy:

// Organizer implementation
const Result* Organizer::GetResult(const std::string& id) const
{
  ResultsType::const_iterator res = mResults.find(id);

  // Memoized ?
  if (res != mResults.end()) return res->second.get();

  // Need to compute it
  // Look module up
  ModulesType::const_iterator mod = mModules.find(id);
  if (mod == mModules.end()) return 0; // unknown module

  // Create a throw away clone
  std::unique_ptr<Module> module(mod->second->Clone());

  // Compute
  std::shared_ptr<const Result> result(module->Execute(*this).release());
  if (!result.get()) return 0;

  // Store result as part of the Memoization thingy
  mResults[id] = result;

  return result.get();
}

A simple Module/Result example:

struct FooResult: Result { FooResult(int r): mResult(r) {} int mResult; };

struct FooModule: Module
{
  virtual FooModule* Clone() const { return new FooModule(*this); }

  virtual ResultPointer Execute(const Organizer& organizer)
  {
    // check that the file has the correct format
    if(!organizer.GetResult("CheckModule")) return ResultPointer();

    return ResultPointer(new FooResult(42));
  }
};

And from main:

#include <iostream>

#include "project/organizer.h"
#include "project/foo.h"
#include "project/bar.h"


int main(int argc, char* argv[])
{
  Organizer org;

  org.AddModule("FooModule", FooModule());
  org.AddModule("BarModule", BarModule());

  for (int i = 1; i < argc; ++i)
  {
    const Result* result = org.GetResult(argv[i]);
    if (result) result->print();
    else std::cout << "Error while playing: " << argv[i] << "\n";
  }
  return 0;
}

How to write a flexible modular program with good interaction possibilities between modules?
