Building multi-child trees from list items

Asked: 2013-08-21 08:12:08

Tags: performance algorithm tree graph-algorithm

I have an input data structure like this:

{
 { {x1,y1,z1,p1}, {x1,y2,z1,p1}, {x1,y3,z1,p1}, {x1,y4,z1,p1} },
 { {x1,y1,z2,p2}, {x1,y2,z2,p2}, {x1,y3,z2,p2}, {x1,y4,z2,p2} },
 { {x1,y1,z3,p3}, {x1,y2,z3,p3}, {x1,y3,z3,p3}, {x1,y4,z3,p3} }
}

To fully understand the problem: each item is a boolean expression (of type ITree), for example nn < 42.

What I want in the end is a set of trees that shows which cell each item list maps to:

Top tree:

                root
                  |
                 x1
            /     |    \
           z1    z2    z3
            |     |    |
           a(0)  a(4) a(8)

Tree a:

               root(offset) - offset = o
            /     /     \      \
           y1    y2     y3     y4
           |      |      |      |
         b(o)  b(o+1)  b(o+2) b(o+3)

Tree b:

               root(offset) - offset = o
            /     |     \
           p1    p2     p3
           |      |      |
           o     o+1    o+2

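So if I have a list of values for which {x1,y2,z1,p1} evaluates to true, I can easily see that it lands in cell 1 (strictly speaking, in cell 0,1), whereas a list for which only {x1,y2,z1,p2} holds is not in any cell.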
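To make the expected result concrete, here is a brute-force reference in C# that answers the same question without any trees. It is only a sketch: expressions are modelled as plain strings instead of ITree, and the names `CellLookup`, `MatchingCells`, and `SampleInput` are made up for illustration. A cell matches when all of its expressions are in the set of true expressions, and its flat index is the row offset (0, 4, 8 in the sample) plus the column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CellLookup
{
    // rows[r][c] holds the expression names that must all hold for cell (r, c).
    // Returns the flat indices of every cell whose expressions are all in trueExprs.
    public static List<int> MatchingCells(string[][][] rows, ISet<string> trueExprs)
    {
        var hits = new List<int>();
        int offset = 0; // flat index of the first cell in the current row
        foreach (var row in rows)
        {
            for (int c = 0; c < row.Length; c++)
            {
                if (row[c].All(e => trueExprs.Contains(e)))
                    hits.Add(offset + c);
            }
            offset += row.Length;
        }
        return hits;
    }

    // Rebuilds the 3 x 4 sample from the question: row r uses z{r} and p{r}.
    public static string[][][] SampleInput()
    {
        var ys = new[] { "y1", "y2", "y3", "y4" };
        return Enumerable.Range(1, 3)
            .Select(r => ys.Select(y => new[] { "x1", y, "z" + r, "p" + r }).ToArray())
            .ToArray();
    }
}
```

This scans every cell on each query, so it is useful only as a correctness check against the tree-based version, not as a replacement for it.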

I have built a working implementation, but it is slow:

public MultiIfTreeList Compile(List<List<ITree>> input, out List<MultiIfTreeList> all, out List<int> orderOfLeafs ) {
  //{ { {x,y} }, { {z} } } -> {0:{x.ToString:x, y.ToString:y}, 1:{z.ToString:z}}
  List<Tuple<int, Dictionary<string, ITree>>> andList = ConvertToDictionaryAndFlatten(input);

  orderOfLeafs = new List<int>();
  all = new List<MultiIfTreeList>();
  return BuildIfTree(andList, orderOfLeafs, all);
}

  private MultiIfTreeList BuildIfTree(List<Tuple<int, Dictionary<string, ITree>>> andList, List<int> orderOfLeafs, List<MultiIfTreeList> all)
  {
     if (andList.Count == 0)
        return null;
     var children = new List<MultiIfTree>();

     while (andList.Count > 0)
     {
        //count the number of occurrences of each statement, e.g. x1=5 and x2=3, and find the highest one
        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (var exp1 in andList.SelectMany(exp => exp.Item2.Keys))
           if (!counts.ContainsKey(exp1))
              counts[exp1] = 1;
           else
              counts[exp1]++;

        var maxcount = counts.Max(x => x.Value);

        if (maxcount == 1) //OPTIMIZATION: then all are different and we can just do them one at a time
        {
           foreach (var lst in andList)
           {
              var item = lst.Item2.First();
              lst.Item2.Remove(item.Key);
              var idx = orderOfLeafs.Count;
              //if that was the last statement of this list, the list ends here as a leaf;
              //otherwise recurse on what is left (an empty list makes BuildIfTree return null)
              var rest = new List<Tuple<int, Dictionary<string, ITree>>>();
              if (lst.Item2.Count == 0)
                 orderOfLeafs.Add(lst.Item1);
              else
                 rest.Add(lst);
              children.Add(
                 new MultiIfTree
                 {
                    Value = item.Key,
                    Ast = item.Value,
                    LeafsThis = orderOfLeafs.Count - idx,
                    Children = BuildIfTree(rest, orderOfLeafs, all),
                    LeafCountWithChildren = orderOfLeafs.Count - idx,
                 });
           }
           andList.Clear();
        }
        else
        {
           //Make lists of where each statement can be found
           foreach (var kvp in counts.Where(x => x.Value == maxcount))
           {
              var max = kvp.Key;
              var idx = orderOfLeafs.Count; //leaf count before this node, used for the two counters below
              ITree exp = null;
              var listWithMax = new List<Tuple<int, Dictionary<string, ITree>>>();
              var listWithoutMax = new List<Tuple<int, Dictionary<string, ITree>>>();
              foreach (var lst in andList)
              {
                 var copy = new Dictionary<string, ITree>(lst.Item2);
                 var item = new Tuple<int, Dictionary<string, ITree>>(lst.Item1, copy);
                 if (copy.ContainsKey(max))
                 {
                    exp = copy[max];
                    copy.Remove(max);
                    if (copy.Count == 0)
                       orderOfLeafs.Add(lst.Item1); //no statements left: this list is a leaf of the current node
                    else
                       listWithMax.Add(item);
                 }
                 else
                    listWithoutMax.Add(item);
              }
              if (exp != null)
                 children.Add(
                    new MultiIfTree
                    {
                       Value = max,
                       Ast = exp,
                       LeafsThis = orderOfLeafs.Count - idx,
                       Children = BuildIfTree(listWithMax, orderOfLeafs, all),
                       LeafCountWithChildren = orderOfLeafs.Count - idx,
                    });
              andList = listWithoutMax;
           }
        }
     }
     var tree = new MultiIfTreeList(children);
     all.Add(tree);
     return tree;
  }
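The listing calls ConvertToDictionaryAndFlatten without showing it. Going only by the comment in Compile (each inner list becomes an index paired with a dictionary keyed by the expression's ToString()), a minimal sketch might look like the following. The class and method names here are hypothetical, the method is generic so that it compiles without the ITree type, and it may well differ from the real helper.

```csharp
using System;
using System.Collections.Generic;

public static class FlattenSketch
{
    // Hypothetical stand-in for ConvertToDictionaryAndFlatten:
    // { {x, y}, {z} } -> { (0, {"x": x, "y": y}), (1, {"z": z}) }
    public static List<Tuple<int, Dictionary<string, T>>> Flatten<T>(List<List<T>> input)
    {
        var result = new List<Tuple<int, Dictionary<string, T>>>(input.Count);
        for (int i = 0; i < input.Count; i++)
        {
            var byText = new Dictionary<string, T>(input[i].Count);
            foreach (var expr in input[i])
            {
                // Assumption: two expressions with the same text are the same test,
                // so a repeated expression collapses into a single entry.
                byText[expr.ToString()] = expr;
            }
            result.Add(Tuple.Create(i, byText));
        }
        return result;
    }
}
```

Keying by ToString() means the cost of building the strings is paid once here, while BuildIfTree then copies and rehashes these dictionaries at every level of the recursion.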

On an input of 25008 lists, each containing 4 expressions, this takes about 800 ms on my machine.

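Edit 1: To actually get the top tree and trees a and b, it is simply a matter of hashing them. In my particular case those 25008 * 4 items end up as 24000 trees, which are cut down to 760 unique trees. This part only takes 21 ms.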
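The hashing step is not shown in the question. One common way to collapse structurally identical subtrees is to intern them bottom-up under a key built from the node's value and the ids of its children. The sketch below assumes a made-up `Node` type (not the MultiIfTree from the listing) and assumes that values never contain `(`, `)`, or `,`, since those characters delimit the key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal node type, used only for this illustration.
public sealed class Node
{
    public string Value;
    public List<Node> Children = new List<Node>();

    public Node(string value, params Node[] children)
    {
        Value = value;
        Children.AddRange(children);
    }
}

public static class SubtreeDedup
{
    // Returns an id that is equal for structurally identical subtrees.
    // ids maps a canonical key to the id assigned when that key was first seen.
    public static int Intern(Node node, Dictionary<string, int> ids)
    {
        // Children are interned first, so the key only needs their ids, not their full text.
        List<int> childIds = node.Children.Select(c => Intern(c, ids)).ToList();
        string key = node.Value + "(" + string.Join(",", childIds) + ")";

        int id;
        if (!ids.TryGetValue(key, out id))
        {
            id = ids.Count; // next unused id
            ids.Add(key, id);
        }
        return id;
    }
}
```

After interning every root, `ids.Count` is the number of distinct subtrees, which corresponds to the kind of reduction described above (24000 trees down to 760 unique ones).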

0 answers:

No answers