Question

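I'm working on a message parser/generator subsystem. I'm creating an auto-generator that uses a database containing all of the information about this protocol, including enum lists, to generate the code. One thing I've run into is the need for hierarchical enumerations.

UPDATE

(I was trying to simplify things by not describing the full problem, but the comments below make it obvious that I erred by simplifying too much.)

The database being used will store things as simplified strings (customer decision), but the protocol only speaks "byte triplets" (aka Hierarchical Enum). The full problem could be described as such:

Given a set of unique strings that each correspond to a unique triplet, 1) find the triplet for any given string, and 2) find the string for any given triplet. Make sure to account for the "Undefined" and "No Statement" enumerations (which do not have strings associated with them). [As one poster noted, yes, it is insane.]

(Caveat: I've been doing C++ for well over a decade, but I've been doing Java this last year; my C++ is probably "corrupted".)

So, to use an admittedly contrived example, given: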

// There is only one category
// POP= "P", COUNTRY= "K", CLASSICAL= "C"
enum Category {POP, COUNTRY, CLASSICAL};

// There is one Type enum for each Category.
// ROCK= "R", BIG_BAND = "B", COUNTRY_POP= "C" 
enum PopType {ROCK, BIG_BAND, COUNTRY_POP};
enum CountryType {CLASSICAL_COUNTRY, MODERN_COUNTRY, BLUEGRASS, COUNTRY_AND_WESTERN};
// ...

// There is one Subtype for each Type
// EIGHTIES= "E", HEAVY_METAL= "H", SOFT_ROCK= "S"
enum RockSubType { EIGHTIES, HEAVY_METAL, SOFT_ROCK};
// ...

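When I get 0, 0, 0 (Pop, Rock, Eighties), I need to translate that to "PRE". Conversely, if I see "PC" in the database, that needs to be sent out the wire as 0, 2 (Pop, Country, NULL).

I'm blatantly ignoring "Undefined" and "No Statement" at this point. Generating a triplet from a string seems straightforward (use an unordered map, string to triplet). Generating a string from a triplet (that may contain a NULL in the last entry)... not so much. Most of the "enum tricks" that I know won't work: for instance, Types repeat values (each Type enum starts at zero), so I can't index an array based on the enum value to grab the string.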
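To make the two lookups concrete, here is a minimal sketch, assuming each level fits in one byte and that 0xFF can stand in for a missing (NULL) level. The names Triplet, pack, TripletTable, and kNone are hypothetical, and std::map stands in for the unordered map only to keep the sketch dependency-free. Packing the three bytes into one key sidesteps the repeated per-level values:

```cpp
#include <map>
#include <string>

// Hypothetical sentinel for "no value at this level" (e.g. the NULL subtype in "PC").
const unsigned char kNone = 0xFF;

struct Triplet {
    unsigned char category, type, subtype;
};

// Pack the three bytes into one integer so the whole triplet can be a map key.
inline unsigned long pack(const Triplet& t) {
    return (static_cast<unsigned long>(t.category) << 16) |
           (static_cast<unsigned long>(t.type) << 8) |
            static_cast<unsigned long>(t.subtype);
}

class TripletTable {
public:
    // Register one string <-> triplet pair (what the generator would emit).
    void add(const std::string& code, unsigned char c, unsigned char t, unsigned char s) {
        Triplet tr = { c, t, s };
        byString_[code] = tr;
        byTriplet_[pack(tr)] = code;
    }
    // String -> triplet; returns false when the string is unknown.
    bool find(const std::string& code, Triplet& out) const {
        std::map<std::string, Triplet>::const_iterator it = byString_.find(code);
        if (it == byString_.end()) return false;
        out = it->second;
        return true;
    }
    // Triplet -> string; returns false when the triplet is unknown.
    bool find(const Triplet& tr, std::string& out) const {
        std::map<unsigned long, std::string>::const_iterator it = byTriplet_.find(pack(tr));
        if (it == byTriplet_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::string, Triplet> byString_;
    std::map<unsigned long, std::string> byTriplet_;
};
```

With this, add("PC", 0, 2, kNone) followed by a find on either side round-trips the pair. How "Undefined" and "No Statement" should be encoded is left open here.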

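What's got me is the relationship. At first glance it appears to be a fairly straightforward "is-a" relationship, but that doesn't work because this case is bidirectional. The leaf -> root navigation is very straightforward, and would be appropriate for a class hierarchy; unfortunately, going the other way is not so straightforward.

I cannot "hand roll" this; I have to generate the code, so that probably eliminates any XML-based solutions. It also has to be "reasonably fast". The "Java solution" involves using protected static variables, initialized on construction, and abstract base classes; however, I do not believe this would work in C++ (order of initialization, etc.). Plus, aesthetically, I feel this should be... more "const". Other code I've seen that tackles this problem uses unions, explicitly listing all of the enum types in the union.

The only other thing I can come up with is using template specialization and explicit specialization, but I'm at a loss. I did a web search on this, but I found nothing that would tell me whether it would even work. Still, if it can be done with a union, can't it be done with template specialization?

Is it possible to do something like this using templates, specialization, and explicit specialization? Is there another, more obvious solution (i.e. a design pattern that I've forgotten) that I'm missing?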
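To illustrate the template question, a rough sketch of what explicit specialization could look like: a traits template maps each Category value to its Type enum and a letter table. TypeTraits, twoLetterCode, and the CountryType letters are made up for illustration. Note that the category must be a compile-time constant for this to apply, so bytes arriving off the wire would still need a runtime dispatch (a switch or a table) in front of it.

```cpp
#include <string>

enum Category { POP, COUNTRY, CLASSICAL };
enum PopType { ROCK, BIG_BAND, COUNTRY_POP };
enum CountryType { CLASSICAL_COUNTRY, MODERN_COUNTRY, BLUEGRASS, COUNTRY_AND_WESTERN };

// Primary template is left undefined, so an unmapped Category fails to compile.
template <Category C> struct TypeTraits;

template <> struct TypeTraits<POP> {
    typedef PopType type;
    static char code(type t) {
        static const char codes[] = { 'R', 'B', 'C' };
        return codes[t];
    }
};

template <> struct TypeTraits<COUNTRY> {
    typedef CountryType type;
    static char code(type t) {
        // These letters are invented for the example; the real ones come from the database.
        static const char codes[] = { 'C', 'M', 'B', 'W' };
        return codes[t];
    }
};

// Builds the two-letter string for a category that is known at compile time.
template <Category C>
std::string twoLetterCode(char categoryCode, typename TypeTraits<C>::type t) {
    std::string s(1, categoryCode);
    s += TypeTraits<C>::code(t);
    return s;
}
```

For example, twoLetterCode<POP>('P', COUNTRY_POP) yields "PC", while passing a CountryType to the POP specialization is rejected by the compiler.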

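Oh, before I forget: the solution must be portable. More specifically, it must work on Windows (Visual Studio 2010) and Red Hat Enterprise Linux 6 / CentOS 6 (GCC 4.4.4 IIRC).

And, lest I forget, this protocol is huge. The theoretical maximum is about 133,000 entries; once I include "Undefined" and "No Statement" I'll probably have that many entries.

Thanks.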

Answer 1

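Effectively, you are in a bit of a pinch here.

My proposal would imply first using 3 enums:

Category
Type
SubType

With no distinction (at first) between the various types or subtypes (we just throw them all in the same basket).

Then, I would simply use a structure: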

struct MusicType {
  Category category;
  Type type;
  SubType subtype;
};

And define a simple set of valid types:

struct MusicTypeLess {
  bool operator()(MusicType const& left, MusicType const& right) const {
    if (left.category < right.category) { return true; }
    if (left.category > right.category) { return false; }

    if (left.type < right.type) { return true; }
    if (left.type > right.type) { return false; }

    return left.subtype < right.subtype;
  }
};

MusicType MusicTypes[] = {
  { Category::Pop, Type::Rock, SubType::EightiesRock },
  // ...
};

// Sort it on initialization, or emit it already sorted during generation

Then you can define simple queries:

typedef std::pair<MusicType const*, MusicType const*> MusicTypeRange;

MusicTypeRange listAll() {
  return MusicTypeRange(MusicTypes, MusicTypes + size(MusicTypes));
}

namespace {
  struct MusicTypeCategorySearch {
    bool operator()(MusicType const& left, MusicType const& right) const {
      return left.category < right.category;
    }
  };
}

MusicTypeRange searchByCategory(Category cat) {
  MusicType const search = { cat, /* doesn't matter */ };
  return std::equal_range(MusicTypes,
                          MusicTypes + size(MusicTypes),
                          search,
                          MusicTypeCategorySearch());
}

namespace {
  struct MusicTypeTypeSearch {
    bool operator()(MusicType const& left, MusicType const& right) const {
      if (left.category < right.category) { return true; }
      if (left.category > right.category) { return false; }

      return left.type < right.type;
    }
  };
}

MusicTypeRange searchByType(Category cat, Type type) {
  MusicType const search = { cat, type, /* doesn't matter */ };
  return std::equal_range(MusicTypes,
                          MusicTypes + size(MusicTypes),
                          search,
                          MusicTypeTypeSearch ());
}

// little supplement :)
bool exists(MusicType const& mt) {
  // MusicType has no operator<, so the ordering functor must be passed explicitly
  return std::binary_search(MusicTypes, MusicTypes + size(MusicTypes), mt, MusicTypeLess());
}

Since the array is sorted, the operations are fast (log N), so it should go smoothly.
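For reference, a self-contained sketch of the category search, assuming a small pre-sorted table. The table contents, kTypes, and the arraySize helper (playing the role of size(...) in the snippets above) are illustrative only:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

enum Category { Pop, Country };
enum Type { Rock, BigBand, Bluegrass };
enum SubType { EightiesRock, HeavyMetal, NoSubType };

struct MusicType { Category category; Type type; SubType subtype; };

// Already sorted by (category, type, subtype), as a generator could emit it.
const MusicType kTypes[] = {
    { Pop, Rock, EightiesRock },
    { Pop, Rock, HeavyMetal },
    { Pop, BigBand, NoSubType },
    { Country, Bluegrass, NoSubType }
};

// Element count of a built-in array, deduced at compile time.
template <typename T, std::size_t N>
std::size_t arraySize(const T (&)[N]) { return N; }

// Orders entries by category only, so equal_range groups a whole category.
struct ByCategory {
    bool operator()(const MusicType& l, const MusicType& r) const {
        return l.category < r.category;
    }
};

// Returns the [first, last) range of entries in the given category.
std::pair<const MusicType*, const MusicType*> searchByCategory(Category cat) {
    MusicType probe = { cat, Rock, EightiesRock };  // only category is compared
    return std::equal_range(kTypes, kTypes + arraySize(kTypes), probe, ByCategory());
}
```

Here searchByCategory(Pop) spans the first three entries. The comparator only has to agree with the sort order on the prefix being searched.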

Answer 2

I think the Music class should contain the subtypes... (has-a), also known as aggregation.

Answer 3

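The leaf -> root navigation is very straightforward, and would be appropriate for a class hierarchy; unfortunately, going the other way is not so straightforward.

I'm not really sure what value you're getting by using enums in the first place. Are there compelling reasons not to just invent a Category class, and then connect instances of it together to model what you're trying to achieve? (I'm reminded of the Qt State Machine Framework...)

In my mind, the good thing about it is how simple it is, and how easy it is to adapt as your needs change. It's boring code. You're not really pushing the compile-time features of the language much. But you say this is generated code, so you don't really have to worry about someone introducing bugs with a cyclic class hierarchy. Just make sure such things aren't generated.

UPDATE: Okay, I read your scenario updates, and it really sounds like you're looking at a database task here. The word "enum" doesn't even come to mind for this. Have you considered SQLite?

http://en.wikipedia.org/wiki/SQLite

Still, putting aside the question of where you're getting this insane list of 133,000 music genres, I have modified my code to give you a concrete performance metric for how C++ can deal with runtime objects at that scale. You'll max things out eventually, but on most machines it can still be fairly snappy... try it: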

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <set>
#include <algorithm>
#include <cstdlib>
using namespace std;

class Category {
private:
    string name;
    Category* parent;
    set<Category*> children;
private:
    static set<Category*> allCategories;
    static vector<Category*>* allCategoriesVector;
public:
    Category (string name, Category* parent) :
        name (name), parent (NULL)
    {
        resetParent(parent);
    }
    void resetParent(Category* newParent) {
        if (parent) {
            parent->children.erase(this);
            if (newParent == NULL) {
                allCategories.erase(this);
                if (allCategoriesVector != NULL) {
                    delete allCategoriesVector;
                    allCategoriesVector = NULL;
                }
            }
        } else {
            if (newParent != NULL) {
                allCategories.insert(this);
                if (allCategoriesVector != NULL) {
                    allCategoriesVector->push_back(this);
                }
            }
        }
        set<Category*>::iterator i = children.begin();
        while (i != children.end()) {
            (*i)->parent = NULL;
            i++;
        } 

        if (newParent) {
            newParent->children.insert(this);
        }

        parent = newParent;
    }
    Category* getRoot() {
       Category* result = this;
       while (result->parent != NULL) {
           result = result->parent;
       }
       return result;
    }
    const string& getNamePart() const {
        return name;
    }
    string getNamePath() const {
        if (parent) {
            return parent->getNamePath() + ":" + getNamePart();
        } else {
            return getNamePart();
        }
    }
    static const vector<Category*>& getAllCategoriesVector() {
        if (allCategoriesVector == NULL) {
           allCategoriesVector = new vector<Category*> (
               allCategories.begin(), allCategories.end()
           );
        }
        return *allCategoriesVector;
    }
    static Category* randomCategory() {
        if (allCategories.empty())
            return NULL;

        // kids: don't try this at home if you want a uniform distribution
        // http://stackoverflow.com/questions/5008804/generating-random-integer-from-a-range
        return getAllCategoriesVector()[rand() % allCategories.size()];
    }
    virtual ~Category() {
        resetParent(NULL);
    }
};
set<Category*> Category::allCategories;
vector<Category*>* Category::allCategoriesVector = NULL;

class CategoryManager {
public:
    Category Root;
        Category Pop;
            Category Rock;
                Category EightiesRock;
                Category HeavyMetal;
                Category SoftRock;
            Category CountryPop;
            Category BigBand;
        Category Country;
        Category Classical;
        Category Jazz;

private:
    set<Category*> moreCategories;
public:
    CategoryManager (int numRandomCategories = 0) :
        Root ("Category", NULL),
            Pop ("Pop", &Root),
                Rock ("Rock", &Pop),
                    EightiesRock ("EightiesRock", &Rock),
                    HeavyMetal ("HeavyMetal", &Rock),
                    SoftRock ("SoftRock", &Rock),
                CountryPop ("CountryPop", &Pop),
                BigBand ("BigBand", &Pop),
            Country ("Country", &Root),
            Classical ("Classical", &Root),
            Jazz ("Jazz", &Root)
    {
        // claim is that there are "hundreds" of these
        // lets make a bunch of them starting with no parent
        for (int i = 0; i < numRandomCategories; i++) {
            stringstream nameStream;
            nameStream << "RandomCategory" << i;
            moreCategories.insert(new Category(nameStream.str(), NULL));
        }

        // now that we have all the categories created, let's
        // reset their parents to something chosen randomly but
        // keep looking until we find one whose path goes up to Root
        set<Category*>::iterator i (moreCategories.begin());
        while (i != moreCategories.end()) {
            (*i)->resetParent(Category::randomCategory());
            i++;
        }
    }
    virtual ~CategoryManager () {
        set<Category*>::iterator i = moreCategories.begin();
        while (i != moreCategories.end()) {
            delete *i;
            i++;
        }
    }
};

int main() {
    CategoryManager cm (133000);

    // how to get to a named category
    cout << cm.EightiesRock.getNamePath() << "\n" << "\n";

    // pick some random categories to output
    for (int i = 0; i < 5; i++) {
        cout << Category::randomCategory()->getNamePath() << "\n";
    }

    return 0;
}

On my machine, this fairly quickly spat out:

Category:Pop:Rock:EightiesRock

Category:Pop:Rock:HeavyMetal:RandomCategory0:RandomCategory6:RandomCategory12:RandomCategory95:RandomCategory116:RandomCategory320:RandomCategory358:RandomCategory1728:RandomCategory6206:RandomCategory126075
Category:Country:RandomCategory80:RandomCategory766:RandomCategory2174
Category:Country:RandomCategory22:RandomCategory45:RandomCategory52:RandomCategory83:RandomCategory430:RandomCategory790:RandomCategory860:RandomCategory1628:RandomCategory1774:RandomCategory4136:RandomCategory10710:RandomCategory13124:RandomCategory19856:RandomCategory20810:RandomCategory43133
Category:Pop:Rock:HeavyMetal:RandomCategory0:RandomCategory5:RandomCategory138:RandomCategory142:RandomCategory752:RandomCategory2914:RandomCategory9516:RandomCategory13211:RandomCategory97800
Category:Pop:CountryPop:RandomCategory25:RandomCategory63:RandomCategory89:RandomCategory2895:RandomCategory3842:RandomCategory5735:RandomCategory48119:RandomCategory76663

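I'd still say a database is the answer you're looking for here, but at the same time you'd be surprised how much abuse a compiler will take these days. A 133K-line file with each line being an object declaration is more tractable than it sounds.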

Answer 4

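Your lookups are done at runtime, so I don't really think a lot of static typing will help you much. I believe you could write them on top of the below if you really wanted them as well.

I don't assume that programmers will be specifying these directly in their day-to-day coding. They will be taking in runtime-generated values and transforming them?

Given that assumption, I would denormalize the enum. This may involve some trade-offs in exchange for getting warnings when a switch statement is missing one of the values.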

struct MusicType {
  enum EnumValue {
    ROOT = 0
    ,Pop
    ,Pop_Rock
    ,Pop_Rock_EightiesRock
    ,Pop_Rock_HeavyMetal
    ,Pop_Rock_SoftRock
    ,Pop_CountryPop
    ,Pop_BigBand
    ,Country
    ,Classical
    ,Jazz
  };
  std::string getLeafString(EnumValue ev) {
    switch (ev) {
      case Pop:         return "Pop";
      case Pop_Rock:    return "Rock";
      // ...
      default:
        throw std::runtime_error("Invalid MusicType (getLeafString)");
    }
  }
  // you could write code to do this easily without generating it too
  std::string getFullString(EnumValue ev) {
    switch (ev) {
      case Pop:         return "Pop";
      case Pop_Rock:    return "Pop::Rock";
      // ...
      default:
        throw std::runtime_error("Invalid MusicType (getFullString)");
    }
  }

};

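Then you need to map your relationships. It sounds like the number of levels is firm, but when that sort of assumption breaks, it is really expensive to fix.

There are a few ways to go about this. I think a data structure is the most straightforward to implement, although you could write one huge switch instead. I think that would be more trouble for similar performance. Really, a switch statement is just a map in the code segment, so pick your poison.

I like to solve problems like this in a way that resolves only one level at a time. This lets you have any number of levels. It makes this lowest level of abstraction simpler. It does make you write more "middleware", but that should be simpler to implement.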

void getChildren(MusicType::EnumValue ev, std::vector<MusicType::EnumValue> &children) {
  typedef std::multimap<MusicType::EnumValue, MusicType::EnumValue> relationships_t;
  typedef std::pair<MusicType::EnumValue, MusicType::EnumValue> mpair_t;
  static relationships_t relationships;
  static bool loaded = false;
  if (!loaded) {
    relationships.insert(mpair_t(MusicType::Pop, MusicType::Pop_Rock));
    relationships.insert(mpair_t(MusicType::Pop_Rock, MusicType::Pop_Rock_EightiesRock));
    // ..
    loaded = true;  // populate the table only once
  }
  // returning these iterators as a pair might be a more general interface
  relationships_t::iterator cur = relationships.lower_bound(ev);
  relationships_t::iterator end = relationships.upper_bound(ev);
  for (; cur != end; cur++) {
    children.push_back(cur->second);
  }
} 

MusicType::EnumValue getParent(MusicType::EnumValue ev) {
  switch (ev) {
    case MusicType::Pop:         return MusicType::ROOT;
    case MusicType::Pop_Rock:    return MusicType::Pop;
    // ...
    default:
      throw std::runtime_error("Invalid MusicType (getParent)");
  }
}

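The great part about isolating it like this is that you can write any sort of combinatorial helper for these without having to worry too much about the structure.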
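As an illustration of such helpers, here is a small sketch built purely on a one-level getParent. The Genre enum, the kParent and kLeafName tables, and the helper names are hypothetical stand-ins for whatever the generator emits:

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum Genre { ROOT = 0, Pop, Pop_Rock, Pop_Rock_HeavyMetal, Country, GENRE_COUNT };

// One-level lookups; a generator could emit both tables, indexed by enum value.
const Genre kParent[GENRE_COUNT] = { ROOT, ROOT, Pop, Pop_Rock, ROOT };
const char* const kLeafName[GENRE_COUNT] = { "", "Pop", "Rock", "HeavyMetal", "Country" };

Genre getParent(Genre g) { return kParent[g]; }

// Walks up one level at a time and collects the nodes, root-most first.
std::vector<Genre> getPath(Genre g) {
    std::vector<Genre> path;
    for (; g != ROOT; g = getParent(g)) {
        path.insert(path.begin(), g);
    }
    return path;
}

// True when 'ancestor' appears on the way from 'g' up to ROOT.
bool isDescendantOf(Genre g, Genre ancestor) {
    for (Genre cur = g; cur != ROOT; cur = getParent(cur)) {
        if (getParent(cur) == ancestor) return true;
    }
    return false;
}

// Joins the leaf names along the path with the given separator.
std::string fullString(Genre g, const std::string& sep) {
    std::vector<Genre> path = getPath(g);
    std::string out;
    for (std::size_t i = 0; i < path.size(); ++i) {
        if (i != 0) out += sep;
        out += kLeafName[path[i]];
    }
    return out;
}
```

None of the helpers knows how many levels exist; they only rely on getParent eventually reaching ROOT.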

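For GUI feedback, this should be fast enough. If you need it to be faster, you may be able to do some inversion of control to avoid a few copies. I don't think I'd start there, though.

You can add extra functionality without changing too much internally, which is generally my main concern with generated code. The open/closed principle is extremely important with generated code.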

Answer 5

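I'm having trouble understanding your intent, but here's a random shot in the dark. MusicCategory is a class that holds the value in the Enum value member. PopTypes inherits publicly from MusicCategory, and RockTypes likewise from PopTypes. As long as the program only stores/passes MusicCategory types, you can assign all types to it from any of the derived class types. Thus you can write MusicCategory Cat = RockTypes::SoftRock;, and if the enums are defined carefully, it will even set Pop/Rock appropriately.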

struct MusicCategory{
   enum Enum {
              NoCategory = 0 | (0<<12),  //"0 |" isn't needed, but shows pattern
              Pop        = 0 | (1<<12), 
              Country    = 0 | (2<<12), 
              Classical  = 0 | (3<<12), 
              Jazz       = 0 | (4<<12),
              All        = INT_MAX} value; 
  //"ALL" forces enum to be big enough for subtypes
   MusicCategory(Enum e) :value(e) {} //this makes the magic work
   operator Enum&() {return value;}
   operator const Enum&() const {return value;}
   operator const int() const {return value;}
   const std::string & getString(MusicCategory::Enum category);
};

// Begin types
// This one is a subtype of MusicCategory::Pop
struct PopTypes : public MusicCategory {
   enum Enum { 
       NoType     = MusicCategory::Pop | (0<<6), 
       Rock       = MusicCategory::Pop | (1<<6), 
       CountryPop = MusicCategory::Pop | (2<<6), 
       BigBand    = MusicCategory::Pop | (3<<6),
       All        = INT_MAX};
   const std::string & getString(PopTypes::Enum category);
};
// ...

// Begin subtypes
struct RockTypes : public PopTypes {
   enum Enum { 
       NoSubType    = PopTypes::Rock | (0<<0),  //"<<0)" isn't needed, but shows pattern
       EightiesRock = PopTypes::Rock | (1<<0),
       HeavyMetal   = PopTypes::Rock | (2<<0), 
       SoftRock     = PopTypes::Rock | (3<<0),
       All          = INT_MAX};
   const std::string & getString(RockTypes::Enum category);
};

int main() {
    MusicCategory Cat = MusicCategory::NoCategory;
    // convertible to and from an int
    Cat = RockTypes::HeavyMetal;
    //automatically sets MusicCategory::Pop and PopTypes::Rock
    bool is_pop = ((Cat & MusicCategory::Pop) == MusicCategory::Pop);
    //returns true
    std::string str = MusicCategory::getString(Cat);
    //returns Pop
    str = PopTypes::getString(Cat);
    //returns Rock
    str = RockTypes::getString(Cat);
    //returns HeavyMetal
}
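The bit layout behind those enum values can be shown in isolation. This is a sketch only, assuming three 6-bit fields as the shifts by 12 and 6 above suggest; the helper names are hypothetical. It also compares the whole category field rather than masking with a single value, since with this layout a test like (code & Pop) == Pop would also be true for Classical (3 shares its low bit with Pop's 1):

```cpp
// Field layout assumed from the shifts above: 6 bits each for subtype, type, category.
const int kFieldBits = 6;
const int kFieldMask = (1 << kFieldBits) - 1;  // 0x3F

// Combine the three per-level values into one packed code.
inline int makeCode(int category, int type, int subtype) {
    return (category << (2 * kFieldBits)) | (type << kFieldBits) | subtype;
}

// Extract each field again by shifting and masking.
inline int categoryOf(int code) { return (code >> (2 * kFieldBits)) & kFieldMask; }
inline int typeOf(int code)     { return (code >> kFieldBits) & kFieldMask; }
inline int subtypeOf(int code)  { return code & kFieldMask; }

// Compares the whole category field, so Classical (3) is not mistaken for Pop (1).
inline bool isInCategory(int code, int category) {
    return categoryOf(code) == category;
}
```

With this, makeCode(1, 1, 2) corresponds to HeavyMetal (Pop, Rock, HeavyMetal) in the numbering used above.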

Answer 6

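First, thanks to everyone for their help. I wasn't actually able to use any of the answers "as is" because of the nature of this problem:

enums repeat their values (every enum can have the same numerical values as its siblings, but with a different label and "meaning")
strings associated with the enum can be repeated as well (a given enum can have the same string as a sibling, but with a different meaning).

I eventually found Boost bimaps, and it turns out that a bimap hierarchy works well for this problem. For those who haven't seen them, a Boost bimap is a bidirectional container that uses either member of the pair as the key and the other as the value.

I can make a bimap of "integer, string" (uint8_t in this case, since the enums here are all guaranteed to be small) and add the, errr, "sub-enum" as information associated with the bimap using with_info.

The hierarchy code looks something like this: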

// Tags
struct category_enum_value {};
struct type_enum_value {};
struct subtype_enum_value {};
struct category_string {};
struct music_type_string {};
struct music_subtype_string {};
struct music_type_info {};
struct music_subtype_info {};

// Typedefs
typedef bimap<
    unordered_set_of< tagged<uint8_t, subtype_enum_value> >,
    unordered_set_of< tagged<std::string, music_subtype_string> >
> music_subtype;
typedef music_subtype::value_type music_subtype_value;

typedef bimap<
    unordered_set_of< tagged<uint8_t, type_enum_value> >,
    unordered_set_of< tagged<std::string, music_type_string> >,
    with_info< tagged<music_subtype, music_subtype_info> >
> music_type_type;
typedef music_type_type::value_type music_type_value;

typedef bimap<
    unordered_set_of< tagged<uint8_t, category_enum_value> >,
    unordered_set_of< tagged<std::string, category_string> >,
    with_info< tagged<music_type_type, music_type_info> > 
> category_type;
typedef category_type::value_type category_value;

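I chose unordered_set for performance reasons. Since this is strictly a "constant" hierarchy, I don't have to worry about insertion and deletion times. And because I'll never be comparing order, I don't have to worry about sorting.

To get the Category information by enum value (that is, to get the string value when given the enum), I use the category_enum_value tag: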

category_type::map_by<category_enum_value>::iterator cat_it = categories.by<category_enum_value>().find(category);
if(cat_it != categories.by<category_enum_value>().end())
{
    const std::string &categoryString = cat_it->get_right();
    // ...

From there, I get the appropriate Type information like this, using the type_enum_value tag (subtype is nearly identical):

    music_type_type &music_type_reference = cat_it->get<music_type_info>();
    music_type_type::map_by<type_enum_value>::iterator type_it = music_type_reference.by<type_enum_value>().find(type);
    if(type_it != music_type_reference.by<type_enum_value>().end())
    {
               // ... second verse, same as the first ...

To get the enum values given the string, change the tag to category_string and use similar methods as before:

    std::string charToFind = stringToFind.substr(0, 1);
    category_type::map_by<category_string>::iterator cat_it = categories.by<category_string>().find(charToFind);
    if(cat_it != categories.by<category_string>().end())
    {
        retval.first = cat_it->get_left();
                    // ... and the beat goes on ...

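Any additional information that I need for any given level (say, menu item strings) can be added by changing the info type from a bimap to a struct containing the bimap and whatever other information I may need.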

Since these are all constant values, I can do all the hard work "up front" and design simple lookup functions, O(1), to get what I need.
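For readers without Boost at hand, the shape of the hierarchy can be approximated with the standard library alone. This is only a stand-in sketch, not the bimap code above: Level, NoChild, and toString are hypothetical names, each level keeps two std::map objects instead of one bimap, and lookups are therefore O(log n) rather than O(1). The info member plays the part that with_info plays above:

```cpp
#include <map>
#include <string>

struct NoChild {};

// One level: value <-> string in both directions, plus a child level per value.
template <typename Child>
struct Level {
    std::map<unsigned char, std::string> byValue;
    std::map<std::string, unsigned char> byString;
    std::map<unsigned char, Child> info;

    // Adds an entry and returns the child level hanging off it.
    Child& add(unsigned char v, const std::string& s) {
        byValue[v] = s;
        byString[s] = v;
        return info[v];
    }
};

typedef Level<NoChild> SubtypeLevel;
typedef Level<SubtypeLevel> TypeLevel;
typedef Level<TypeLevel> CategoryLevel;

// Triplet -> string: descend one level per byte, appending that level's letter.
bool toString(const CategoryLevel& cats, unsigned char c, unsigned char t,
              unsigned char s, std::string& out) {
    std::map<unsigned char, std::string>::const_iterator ci = cats.byValue.find(c);
    if (ci == cats.byValue.end()) return false;
    const TypeLevel& types = cats.info.find(c)->second;  // add() always creates it
    std::map<unsigned char, std::string>::const_iterator ti = types.byValue.find(t);
    if (ti == types.byValue.end()) return false;
    const SubtypeLevel& subs = types.info.find(t)->second;
    std::map<unsigned char, std::string>::const_iterator si = subs.byValue.find(s);
    if (si == subs.byValue.end()) return false;
    out = ci->second + ti->second + si->second;
    return true;
}
```

Because every level owns its own pair of maps, siblings can reuse both numeric values and letters without colliding, which is the property that ruled out the flat approaches.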

Hierarchical Enums in C++