Combinations of N Boost interval_set

Time: 2015-02-04 11:09:00

Tags: c++ algorithm boost intervals boost-icl

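I have a service that has outages in 4 different locations. I am modeling each location's outages as a Boost ICL interval_set. I want to know when at least N locations have an active outage.

Therefore, following this answer, I have implemented a combination algorithm, so that I can create combinations of the elements via interval_set intersections.

When this process is over, I should have a certain number of interval_sets, each one defining the outages for N locations simultaneously, and the final step will be joining them to get the desired full picture.

The problem is that I am currently debugging the code, and when the time comes to print each intersection, the output text goes crazy (even when I step through with gdb), so I cannot read it, and CPU usage spikes.

I guess that I am somehow sending more memory to the output than I should, but I cannot see where the problem is.

This is an SSCCE: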

#include <boost/icl/interval_set.hpp>
#include <algorithm>
#include <iostream>
#include <vector>


int main() {
    // Initializing data for test
    std::vector<boost::icl::interval_set<unsigned int> > outagesPerLocation;
    for(unsigned int j=0; j<4; j++){
        boost::icl::interval_set<unsigned int> outages;
        for(unsigned int i=0; i<5; i++){
            outages += boost::icl::discrete_interval<unsigned int>::closed(
                (i*10), ((i*10) + 5 - j));
        }
        std::cout << "[Location " << (j+1) << "] " << outages << std::endl;
        outagesPerLocation.push_back(outages);
    }

    // So now we have a vector of interval_sets, one per location. We will combine
    // them so we get an interval_set defined for those periods where at least
    // 2 locations have an outage (N)
    unsigned int simultaneusOutagesRequired = 2;  // (N)

    // Create a bool vector in order to filter permutations, and only get
    // the sorted permutations (which equals the combinations)
    std::vector<bool> auxVector(outagesPerLocation.size());
    std::fill(auxVector.begin() + simultaneusOutagesRequired, auxVector.end(), true);

    // Create a vector where combinations will be stored
    std::vector<boost::icl::interval_set<unsigned int> > combinations;

    // Get all the combinations of N elements
    unsigned int numCombinations = 0;
    do{
        bool firstElementSet = false;
        for(unsigned int i=0; i<auxVector.size(); i++){
            if(!auxVector[i]){
                if(!firstElementSet){
                    // First location, insert to combinations vector
                    combinations.push_back(outagesPerLocation[i]);
                    firstElementSet = true;
                }
                else{
                    // Intersect with the other locations
                    combinations[numCombinations] -= outagesPerLocation[i];
                }
            }
        }
        numCombinations++;
        std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here
    }
    while(std::next_permutation(auxVector.begin(), auxVector.end()));

    // Get the union of the intersections and see the results
    boost::icl::interval_set<unsigned int> finalOutages;
    for(std::vector<boost::icl::interval_set<unsigned int> >::iterator
        it = combinations.begin(); it != combinations.end(); it++){
        finalOutages += *it;
    }

    std::cout << finalOutages << std::endl;
    return 0;
}

Any help?

2 answers:

Answer 0 (score: 11)

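As I surmised, there is a "high-level" approach here.

Boost ICL containers are more than just containers of "glorified pairs of interval start/end points". They are designed to implement exactly that business of combining and searching, in a generically optimized fashion.

So you don't have to.

If you let the library do what it is supposed to do: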

using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval  = DownTimes::interval_type;
using Records   = std::vector<DownTimes>;

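Using functional domain typedefs invites a higher-level approach. Now, let's ask ourselves the hypothetical "business question":

What do we actually want to do with our records of per-location downtimes?

Well, we essentially want to

  1. tally them for all discernible time slots,
  2. filter those where the tally is at least 2, and
  3. finally, show the "merged" time slots that remain.

OK, engineer: implement it!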


    1. Hmm. Tallying. How hard could it be?

      The key to elegant solutions is the choice of the right data structure:

      using Tally     = unsigned; // or: bit mask representing affected locations?
      using DownMap   = boost::icl::interval_map<TimePoint, Tally>;
      

      Now it is just bulk insertion:

      // We will do a tally of affected locations per time slot
      DownMap tallied;
      for (auto& location : records)
          for (auto& incident : location)
              tallied.add({incident, 1u});
      
    2. OK, let's filter. We just need a predicate that works on our DownMap, right?

      // define threshold where at least 2 locations have an outage
      auto exceeds_threshold = [](DownMap::value_type const& slot) {
          return slot.second >= 2;
      };
      
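    3. Merge the time slots!

      Actually, we just create another DownTimes set. Only this time, it is not per location.

      The choice of data structure wins the day again: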

      // just printing the union of any criticals:
      DownTimes merged;
      for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
          merged.insert(slot);
      
    4. Report!

      std::cout << "Criticals: " << merged << "\n";
      

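Note that we got nowhere close to manipulating array indices, overlapping or non-overlapping intervals, closed or open boundaries. Or, [eeeeek!] brute-force permutations of collection elements.

We just stated our goals and let the library do the work.

Full Demo

Live On Coliru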

      #include <boost/icl/interval_set.hpp>
      #include <boost/icl/interval_map.hpp>
      #include <boost/range.hpp>
      #include <boost/range/algorithm.hpp>
      #include <boost/range/adaptors.hpp>
      #include <boost/range/numeric.hpp>
      #include <boost/range/irange.hpp>
      #include <algorithm>
      #include <iostream>
      #include <vector>
      
      using TimePoint = unsigned;
      using DownTimes = boost::icl::interval_set<TimePoint>;
      using Interval  = DownTimes::interval_type;
      using Records   = std::vector<DownTimes>;
      
      using Tally     = unsigned; // or: bit mask representing affected locations?
      using DownMap   = boost::icl::interval_map<TimePoint, Tally>;
      
      // Just for fun, removed the explicit loops from the generation too. Obviously,
      // this is a bit gratuitous :)
      static DownTimes generate_downtime(int j) {
          return boost::accumulate(
                  boost::irange(0, 5),
                  DownTimes{},
                  [j](DownTimes accum, int i) { return accum + Interval::closed((i*10), ((i*10) + 5 - j)); }
              );
      }
      
      int main() {
          // Initializing data for test
          using namespace boost::adaptors;
          auto const records = boost::copy_range<Records>(boost::irange(0,4) | transformed(generate_downtime));
      
          for (auto location : records | indexed()) {
              std::cout << "Location " << (location.index()+1) << " " << location.value() << std::endl;
          }
      
          // We will do a tally of affected locations per time slot
          DownMap tallied;
          for (auto& location : records)
              for (auto& incident : location)
                  tallied.add({incident, 1u});
      
          // We will combine them so we get an interval_set defined for those periods
          // where at least 2 locations have an outage
          auto exceeds_threshold = [](DownMap::value_type const& slot) {
              return slot.second >= 2;
          };
      
          // just printing the union of any criticals:
          DownTimes merged;
          for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
              merged.insert(slot);
      
          std::cout << "Criticals: " << merged << "\n";
      }
      

Prints

      Location 1 {[0,5][10,15][20,25][30,35][40,45]}
      Location 2 {[0,4][10,14][20,24][30,34][40,44]}
      Location 3 {[0,3][10,13][20,23][30,33][40,43]}
      Location 4 {[0,2][10,12][20,22][30,32][40,42]}
      Criticals: {[0,4][10,14][20,24][30,34][40,44]}
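
For readers who want to see what the tally-and-filter steps amount to without Boost, here is a minimal standard-library sketch of the same idea using a sweep line over interval endpoints. It is not how Boost ICL is implemented, and it does not replace the interval_map solution above. The names `Closed`, `Location` and `covered_at_least` are hypothetical, and the sketch assumes discrete time points, disjoint intervals within each location (which interval_set guarantees), a threshold of at least 1, and no interval ending at the maximum `TimePoint` value.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using TimePoint = unsigned;
using Closed    = std::pair<TimePoint, TimePoint>; // closed interval [first, second]
using Location  = std::vector<Closed>;             // assumed disjoint within one location

// Hypothetical helper (not part of Boost ICL): returns the merged closed
// intervals during which at least `threshold` locations are down.
// Assumes threshold >= 1 and that no interval ends at the maximum TimePoint.
std::vector<Closed> covered_at_least(std::vector<Location> const& records, unsigned threshold) {
    // Sweep-line deltas: +1 where an outage starts, -1 one past where it ends.
    std::map<TimePoint, int> delta;
    for (auto const& location : records) {
        for (auto const& outage : location) {
            ++delta[outage.first];
            --delta[outage.second + 1];
        }
    }

    std::vector<Closed> merged;
    int active = 0;          // running tally of locations that are down
    bool in_run = false;     // are we inside a run that meets the threshold?
    TimePoint run_start = 0;

    for (auto const& entry : delta) {
        active += entry.second;
        bool const critical = active >= static_cast<int>(threshold);
        if (critical && !in_run) {
            run_start = entry.first;   // a qualifying run begins here
            in_run = true;
        } else if (!critical && in_run) {
            merged.emplace_back(run_start, entry.first - 1); // run ended one point earlier
            in_run = false;
        }
    }
    return merged;
}
```

Calling `covered_at_least(records, 2)` on the same test data as the demo (five intervals `[10*i, 10*i + 5 - j]` for each location `j`) should yield the same five intervals as the "Criticals" line above.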
      

Answer 1 (score: 3)

At the end of the permutation loop, you write:

numCombinations++;
std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here

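My debugger tells me that on the first iteration, numCombinations was 0 before the increment. Incrementing it put it out of range for the combinations container, which at that point holds a single element at index 0. Reading past the end with operator[] is undefined behavior, which would be consistent with the garbage output you describe.

Did you mean to increment it after the use? Was there any particular reason not to use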

std::cout << "[-INTERSEC-] " << combinations.back() << "\n";

or, for C++03,

std::cout << "[-INTERSEC-] " << combinations[combinations.size()-1] << "\n";

or even just:

std::cout << "[-INTERSEC-] " << combinations.at(numCombinations) << "\n";

which would have thrown std::out_of_range.
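
To make the off-by-one concrete, here is a small sketch that mimics the loop's bookkeeping with a plain `std::vector<int>` in place of the vector of interval_sets. The function names are hypothetical and only illustrate the point: after the increment, the counter equals the container size, so checked access throws, while `back()` always refers to the element that was just pushed.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reproduces "push one element, increment the counter,
// then index with the counter". Returns true if the checked access throws.
bool index_after_increment_is_out_of_range() {
    std::vector<int> combinations;
    unsigned numCombinations = 0;

    combinations.push_back(42); // the first (and only) element lives at index 0
    numCombinations++;          // now 1, which equals combinations.size()

    try {
        (void)combinations.at(numCombinations); // checked access, unlike operator[]
        return false;
    } catch (std::out_of_range const&) {
        return true;
    }
}

// The safe alternative: back() refers to the most recently pushed element,
// regardless of any separately maintained counter. Requires a non-empty vector.
int last_pushed(std::vector<int> const& values) {
    return values.back();
}
```

With `operator[]` instead of `at()`, the same access would be undefined behavior rather than an exception, so the symptoms can vary from run to run.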


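On a side note, I think Boost ICL has vastly more efficient ways to get the answer you are after. Let me think about this for a moment. I will post another answer if I see it.

UPDATE: Posted the other answer, showcasing high-level coding with Boost ICL.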