Question

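I am trying to implement a generic n-dimensional tree that holds n-dimensional data. By n-dimensional data I mean, in my case, data points with 6-7 coordinates. Here are the tree node (a composite data type) and the tree class: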

#data = data points (i.e. [x,y,z,k,m,n])
#hypercube = set of tuples; coordinates [(x0,x1),(y0,y1)...]
class _Node:
    def __init__(self, data, hypercube):
        self.data = data
        self.hypercube = hypercube

class _nTree:
    def __init__(self, hypercube, depth = 0):
        self.node = []
        self.children = []
        self.depth = depth
        self.hypercube = hypercube

    def __insert__(self, data):
        if not self.node:
            self.node = _Node(data, self.hypercube)
            if (len(self.node.data) != 1):
                self.__split__()

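In my case, each child will hold the data contained in its parent node, which is why I check whether len(self.node.data) is not equal to 1. If the hypercube contains only one data point, we stop and have a leaf node. If it contains more than one, we split further. A data point is placed in a hypercube only if it lies within the bounds defined by that hypercube's coordinates.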

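For example, suppose you have a 2D plane with coordinates [(0,1),(0,1)] as the root node. We want to populate it with the data points [(0.5,0.1),(0.2,0.3)]. Since we have two data points, we split the plane into 2^n new hypercubes (squares in this case), where n is the number of dimensions, here 2. From the 1x1 root square we get 4 smaller squares with coordinates [[(0,0.5),(0,0.5)],[(0.5,1),(0.5,1)],[(0.5,1),(0,0.5)],[(0,0.5),(0.5,1)]], which are essentially the children of the root node. This is an example of a quadtree, as described here: https://en.wikipedia.org/wiki/Quadtree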
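A minimal sketch of the split in this 2D example, assuming a hypercube is stored as a list of (lo, hi) tuples, one per dimension. The helper name `split_hypercube` is hypothetical and not part of the classes above; the same function works unchanged for 6 or 7 dimensions.

```python
from itertools import product

def split_hypercube(hypercube):
    """Split a hypercube [(lo, hi), ...] into 2**n equally sized children."""
    halves = []
    for lo, hi in hypercube:
        mid = (lo + hi) / 2
        # each dimension contributes a lower half and an upper half
        halves.append([(lo, mid), (mid, hi)])
    # picking one half per dimension yields all 2**n combinations
    return [list(combo) for combo in product(*halves)]

children = split_hypercube([(0, 1), (0, 1)])
for child in children:
    print(child)
```

For the unit square this gives the four quadrants listed above (possibly in a different order); for a 6-dimensional hypercube it returns 64 children.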

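I am trying to do the same thing, but with more dimensions.

Now that I have tried to explain what I want to do, my question is:

The hypercube variable holds the coordinates of the current node. How can I implement my split function so that it generates the coordinates correctly? For example, with 6 dimensions I have to generate 64 sets of coordinates for each node (2^n; n = number of dimensions). As a heads-up, this is not a k-d tree.

Edit: I figured I should post my current split function: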

def __split__(self):
    n_of_children = 2**len(self.node.hypercube)  # hypercube holds one (lo, hi) tuple per dimension
    vector = self.__get_vector__() #returns the coordinates of all 64 hypercubes/trees
    self.children = [_nTree(vector, self.depth+1) for i in range(n_of_children)]
    self.__insert_children__(self.node.data)  # the data lives on the node, not on the tree

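I declare each child as a tree structure and then call insert_children to decide which child each data point goes into. If a child contains more than one data point, we repeat the whole split-and-insert process.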
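One possible way to decide which child a data point belongs to is to compare each coordinate with the midpoint of the parent's interval in that dimension and pack the results into an integer index. This is only a sketch: `child_index` is a hypothetical helper, and it assumes the children are ordered so that the first dimension is the most significant bit and that points on a midpoint go to the upper half.

```python
def child_index(point, hypercube):
    """Return the index (0 .. 2**n - 1) of the child hypercube containing `point`."""
    index = 0
    for coord, (lo, hi) in zip(point, hypercube):
        mid = (lo + hi) / 2
        # upper half contributes a 1 bit, lower half a 0 bit
        index = (index << 1) | (1 if coord >= mid else 0)
    return index

root = [(0, 1), (0, 1)]
print(child_index((0.5, 0.1), root))
print(child_index((0.2, 0.3), root))
```

Because the index is computed directly from the point, no search over all 2^n children is needed when inserting.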

Answer 1

I once wrote a k-dimensional quadtree in Java; here is the code:

NodeKD(double[] min, double[] max, int maxDepth, NodeKD parent) {
    this.min = min;
    this.max = max;
    this.center = new double[min.length];
    for (int i = 0; i < min.length; i++) {
        this.center[i] = (max[i]+min[i])/2;
    }
    this.maxDepth = maxDepth == -1 ? 4 : maxDepth;
    this.children = new ArrayList<>();
    qA = new NodeKD[1 << min.length];
    this.parent = parent;
}

private void subdivide() {
    int dim = this.min.length;
    for (int i = 0; i < qA.length; i++) {
        // Each child needs its own arrays; sharing them would give all children the same bounds.
        double[] min = new double[dim];
        double[] max = new double[dim];
        long mask = 1L;
        for (int j = 0; j < dim; j++) {
            // Bit j of the child index i selects the lower or upper half in dimension j.
            if ((i & mask) == 0) {
                min[j] = this.min[j];
                max[j] = this.center[j];
            } else {
                min[j] = this.center[j];
                max[j] = this.max[j];
            }
            mask <<= 1;
        }
        qA[i] = new NodeKD(min, max, maxDepth-1, this);
    }
}

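However, as far as I know, quadtrees (2D) and octrees (3D) are not very efficient for higher dimensions. Depending on what you want to do (range queries, nearest-neighbor queries, simple lookups, lots of insertions, ...), I would choose a different structure. KD-trees are simple and good for insertion/deletion. R-trees (R+ tree, R* tree, X-tree) are well suited to range queries and nearest-neighbor queries. However, the original R-tree copes badly with data being added or removed later on.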

My personal favorite is my own PH-Tree. It is similar to a k-dimensional quadtree, but with some differences:

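It is essentially a 'trie' or 'critbit' tree. This means it looks at the bit representation of the values and branches where one value has a '0' bit and another has a '1'. Because I operate at the bit level, navigation and addressing within a node are very efficient: I can simply manipulate k-bit strings (for k dimensions) to iterate over the local child nodes and check whether they are relevant to a query. This avoids many of the scalability problems that come with high dimensionality.
It uses prefix sharing to reduce memory requirements (each node stores only the bits in which the local values differ from one another).
Because of its static nature (splitting according to bits), no modification ever affects more than two nodes, so rebalancing is never needed.
Although there is no rebalancing, the depth of the tree is limited to 64 (assuming 64-bit values), so it cannot degenerate badly.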
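To illustrate the bit-level addressing idea, here is a rough Python sketch. It is not taken from the PH-Tree source; the function name `hc_address` is made up, and it assumes non-negative integer coordinates. At a given bit position, one bit is taken from each of the k values and packed into a k-bit address that identifies the child slot.

```python
def hc_address(values, bit):
    """Pack bit number `bit` of each of the k integer values into a k-bit address."""
    address = 0
    for v in values:
        # shift in one bit per dimension, first dimension most significant
        address = (address << 1) | ((v >> bit) & 1)
    return address

point = (5, 2, 7)            # binary: 101, 010, 111
print(hc_address(point, 2))  # uses the highest of the three bits of each value
print(hc_address(point, 1))
```

Iterating over local children then amounts to iterating over integers in the range 0 .. 2^k - 1, which is what makes the k-bit string representation convenient.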

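More details can be found here and here. The downside is that the current open-source version is Java only (not Python) and quite complex. I have a considerably improved version (simpler code) in the works, but it may take a while before I release it.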

Answer 2

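Here is a similar implementation that I developed for running CPPN queries in hypercube space. While reviewing the quadtree algorithm, I also noticed that a subdivision should yield 2^n coordinates, and found that these coordinates are given by the permutations of each dimension's value plus or minus width/2. Since the only thing that changes from one permutation to the next is the sign, which has just two states, I found a quite elegant solution based on enumerating binary strings, which can be found here: https://codereview.stackexchange.com/questions/24690/print-all-binary-strings-of-length-n. Using it, you can loop 2^n times, create a new point on each iteration, and use the bit string at that index to determine the sign of the second term in each dimension's sum (1 = positive, 0 = negative).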
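As a rough sketch of that idea in Python (the function name `child_centers` is hypothetical), each index from 0 to 2^n - 1 is formatted as a zero-padded bit string of length n, and each character decides whether width/2 is added to or subtracted from the corresponding coordinate:

```python
def child_centers(center, width):
    """Return the 2**n points center[d] +/- width/2, one per bit string of length n."""
    n = len(center)
    centers = []
    for i in range(2 ** n):
        bits = format(i, "0{}b".format(n))  # zero-padded binary string of length n
        centers.append([
            c + width / 2 if b == "1" else c - width / 2
            for c, b in zip(center, bits)
        ])
    return centers

for point in child_centers([0.0, 0.0], 1.0):
    print(point)
```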

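Below I include the Java implementation first, because I think it is more readable and its bit-string permutation is more elegant, followed by the Python implementation, which does not use int -> bitstring to generate the permutations. Both subdivide a coordinate of arbitrary dimension n into 2^n subtrees.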

public class FractalTree {
    public double[] coord;
    public double width;
    public double weight;
    public double lvl;
    public String[] signs;
    public FractalTree[] children;
    public FractalTree(double[] c, double width, int lvl) {
        this.coord = c;
        this.width = width;
        this.lvl = lvl;
        this.children = new FractalTree[(int)Math.pow(2.0, (double)c.length)];
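        // signs must be allocated before permute_signs() fills it
        this.signs = new String[this.children.length];
        this.permute_signs(this.coord.length);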
    }
    public void subdivide_into_children() {
        for(int idx = 0; idx < this.children.length; idx++) {
            String sign_pattern = this.signs[idx];
            double[]  new_coord = new double[this.coord.length];
            for(int idx_2 = 0; idx_2 < this.coord.length; idx_2++) {
                char sign = sign_pattern.charAt(idx_2);
                if(sign == '1') {
                    new_coord[idx_2] = coord[idx_2] + this.width/2.0;
                } else {
                    new_coord[idx_2] = coord[idx_2] - this.width/2.0;
                }
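            }
            // store the child centered at new_coord, with half the parent's width
            this.children[idx] = new FractalTree(new_coord, this.width / 2.0, (int) this.lvl + 1);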
        }
    }

    public void permute_signs(int coord_len) {
        String str_len = "%" + Integer.toString(coord_len) + "s";
        for(long ix = 0; ix < this.children.length; ix++) {
            this.signs[(int)ix] = String.format(str_len, Long.toBinaryString(ix)).replace(' ', '0');
        }
    }

}

import itertools

class nDimensionTree:

    def __init__(self, in_coord, width, level):
        self.w = 0.0
        self.coord = in_coord
        self.width = width
        self.lvl = level
        self.num_children = 2**len(self.coord)
        self.cs = [None] * self.num_children
        self.signs = self.set_signs()
        print(self.signs)
    def set_signs(self):
        return list(itertools.product([1,-1], repeat=len(self.coord)))

    def divide_childrens(self):
        for x in range(self.num_children):
            new_coord = []
            for y in range(len(self.coord)):
                # signs[x] is a tuple with one sign per dimension, so index it with y
                new_coord.append(self.coord[y] + (self.width/(2*self.signs[x][y])))
            newby = nDimensionTree(new_coord, self.width/2, self.lvl+1)
            self.cs[x] = newby  # fill the preallocated slot for this child

n-D tree - computing the coordinates of hypercubes