Question

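I have the following problem:

I have n hash maps (let's call them A.1, A.2, ..., A.n) that map from std::string to boost::variant, and I want to merge them into a single hash map (let's call it B) as follows:

If A.1, A.2, ..., A.n all contain the same value for key K, then B should contain that same value for key K.
If some key K is absent from at least one of the maps A.1, A.2, ..., A.n, then B should contain a mapping from K to the value boost::blank.
If, for some key K, one of the maps A.1, A.2, ..., A.n holds a value that differs from the others, then B should contain a mapping from K to the value boost::blank.

I have to do this a lot, and I know it will become a bottleneck. What is the most efficient way to achieve this? Are there any libraries that support merging hash maps like this?

Edit: If some other data structure is more suitable, please let me know. However, I do need O(1) lookup.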

Answer 1

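If I am reading this correctly, you are looking for an O(N log N) solution. I don't have the C++ experience to write a proper implementation, but this is what you would want to do: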

1) Push all of the elements from every map into a `Set`   `O(N) + overhead of checking existence O(1)`  
   A)  define the elements of the set by a hash value generated by `Key + Value`   
      a)  that is `A:hello`  creates a different value than `B:hello`  
2) Perform a merge sort `O(N log N)` based on the values  
3) Start at the beginning of your `sorted set` and iterate over the values.  
   A) If a value is found multiple times this satisfies your `3` step  
   B) Your `2` step can be satisfied in `O(1)` time by looking up the key
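
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions, not the answerer's implementation: `std::variant` and `std::monostate` stand in for `boost::variant` and `boost::blank`, a `std::vector` plus `std::sort` replaces the set and the hand-written merge sort, and the absent-key rule is checked by counting how many maps contributed a key rather than by a per-key lookup. The names `Value`, `Map` and `merge_by_sorting` are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Assumption: std::variant / std::monostate stand in for boost::variant / boost::blank.
using Value = std::variant<std::monostate, int, std::string>;
using Map = std::unordered_map<std::string, Value>;

// Hypothetical helper: merges the maps by sorting all (key, value) pairs.
Map merge_by_sorting(const std::vector<Map>& maps) {
    // Step 1: gather every (key, value) pair from every map.
    std::vector<std::pair<std::string, Value>> entries;
    for (const Map& m : maps) {
        entries.insert(entries.end(), m.begin(), m.end());
    }

    // Step 2: sort by key, then by value, so equal keys become adjacent.
    std::sort(entries.begin(), entries.end());

    // Step 3: walk each run of equal keys.
    Map result;
    for (std::size_t i = 0; i < entries.size();) {
        std::size_t j = i;
        while (j < entries.size() && entries[j].first == entries[i].first) {
            ++j;
        }
        // Each map holds a key at most once, so a run as long as the
        // number of maps means the key is present everywhere.
        const bool in_every_map = (j - i == maps.size());
        // The run is sorted by value, so first == last means all agree.
        const bool all_equal = (entries[i].second == entries[j - 1].second);
        result[entries[i].first] =
            (in_every_map && all_equal) ? entries[i].second : Value{};
        i = j;
    }
    return result;
}
```

The sort dominates the cost, so with N total entries across all maps this runs in O(N log N), and the result is still an `std::unordered_map` with O(1) average lookup.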

Answer 2

Here is my algorithm (it just iterates over all the elements):

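std::unordered_map<string, variant> result
Pick the map that contains the most entries -> O(n).
Iterate through each element of the largest map -> O(m)
1. result[current_key] = current_value
2. For each of the other maps -> O(n-1)
  1. Look up the key -> O(1)
  2. If the key is absent -> result[current_key] = blank. Move on to the next item in the largest map.
  3. [key present] If current_value != map[key] -> result[current_key] = blank. Move on to the next item in the largest map.
  4. [key present] Otherwise continue with the next of the other maps; once all of them have been checked, move on to the next item in the largest map.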
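
A minimal C++17 sketch of this loop might look like the following. It assumes `std::variant` and `std::monostate` as stand-ins for `boost::variant` and `boost::blank`, and the names `Value`, `Map` and `merge_by_lookup` are invented for the example. It also adds one pass that the steps above do not mention: keys that are missing from the largest map are absent from at least one map, so they are inserted as blank.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Assumption: std::variant / std::monostate stand in for boost::variant / boost::blank.
using Value = std::variant<std::monostate, int, std::string>;
using Map = std::unordered_map<std::string, Value>;

// Hypothetical helper: merges the maps by probing every other map per key.
Map merge_by_lookup(const std::vector<Map>& maps) {
    Map result;
    if (maps.empty()) {
        return result;
    }

    // Pick the map with the most entries: O(n).
    const Map& largest = *std::max_element(
        maps.begin(), maps.end(),
        [](const Map& a, const Map& b) { return a.size() < b.size(); });

    // Iterate over the largest map: O(m) items, each probing up to n-1 maps.
    for (const auto& [key, value] : largest) {
        Value merged = value;
        for (const Map& other : maps) {
            if (&other == &largest) {
                continue;
            }
            const auto it = other.find(key);  // O(1) on average
            if (it == other.end() || it->second != value) {
                merged = Value{};  // absent or different: blank
                break;             // move on to the next item
            }
        }
        result.emplace(key, std::move(merged));
    }

    // Extra pass (not in the steps above): any key not seen yet is missing
    // from the largest map, so it maps to blank. emplace leaves existing
    // keys untouched.
    for (const Map& other : maps) {
        for (const auto& entry : other) {
            result.emplace(entry.first, Value{});
        }
    }
    return result;
}
```

The extra pass touches every entry once, so the overall cost stays within O(m * n).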

That's all. You have O(n) + O(m * n), which equals O(m * n). Since step (2) of @Woot4Moo's algorithm takes O(m * n * log(m * n)), @Woot4Moo's algorithm appears to be less efficient.

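I'm not sure it can be made faster than O(m * n) in any simple way. I think you would need some kind of on-the-fly processing (caching) to do this faster. But such caching is not very safe. So consider the simple algorithm first. Maybe it won't be a bottleneck in your application after all.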

Algorithm: merging std::unordered_map
