Question

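I have a complex problem, and I'm not sure I can describe it properly.

I have a two-dimensional array of objects of a class. Currently my algorithm operates only on this two-dimensional array, but only some of its positions are occupied (roughly 40%).

It works fine for small data sets, but with a large data set (a large number of elements in that 2D array, e.g. 10000) the program runs out of memory, because my nested loops make 10000 * 10000 = 100,000,000 iterations.

Can I replace the 2D array with a Hashtable or some other data structure? My main aim is to reduce the number of iterations only by changing the data structure.

Pardon me for not explaining this well. I am developing in C#.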

Answer 1

It sounds like the data structure you have is a sparse matrix, so I'm going to point you to the question "Are there any storage optimized Sparse Matrix implementations in C#?"

Answer 2

You can create a dictionary key from the array coordinates. Something like:

int key = x * 46000 + y;

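(This works for coordinates of an array up to about 46000x46000, which is the largest size whose combined key still fits in an int. If you need to represent a larger array, use a long value as the key.)

With the key you can store and retrieve objects in a Dictionary<int, YourClass>. Storing and retrieving values in a dictionary is quite fast, not much slower than using an array.

You can iterate over the items in the dictionary, but you won't get them in a predictable order, i.e. not in the order you would get by looping over the x and y coordinates of an array.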
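As a rough sketch of the long-key variant mentioned above, the coordinates can be packed into the two halves of a 64-bit key so that no size limit like 46000 applies. The SparseGrid<T> class and its member names are made up for illustration and are not part of the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: stores only the occupied cells,
// keyed by a long built from the two coordinates.
public class SparseGrid<T> where T : class
{
    private readonly Dictionary<long, T> cells = new Dictionary<long, T>();

    // Pack x into the high 32 bits and y into the low 32 bits.
    private static long MakeKey(int x, int y)
    {
        return ((long)x << 32) | (uint)y;
    }

    public int Count { get { return cells.Count; } }

    // Returns the stored object, or null when the cell is empty.
    public T Get(int x, int y)
    {
        T value;
        return cells.TryGetValue(MakeKey(x, y), out value) ? value : null;
    }

    // Stores a value; passing null clears the cell.
    public void Set(int x, int y, T value)
    {
        long key = MakeKey(x, y);
        if (value == null)
            cells.Remove(key);
        else
            cells[key] = value;
    }

    // Visits only the occupied cells, in no particular order.
    public IEnumerable<(int X, int Y, T Value)> Occupied()
    {
        foreach (KeyValuePair<long, T> pair in cells)
        {
            int x = (int)(pair.Key >> 32);
            int y = unchecked((int)pair.Key); // low 32 bits
            yield return (x, y, pair.Value);
        }
    }
}
```

A loop over Occupied() runs once per stored object rather than once per array position, which is where the reduction in iterations would come from.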

Answer 3

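If you need high performance, you can roll your own data structure. If each object can live in only one container and is never moved to another container, you can build a custom hash-set-like data structure.

You add X, Y and Next fields to your class. The objects then form singly linked lists stored in an array, and that array is your hash table. This can be very fast.

I wrote this from scratch, so there may be bugs. Clear and rehash are not implemented; this is only a demo. All operations have O(1) average complexity.

To make it easy to enumerate all nodes while skipping empty positions, there is also a doubly linked list. Insertion into and removal from a doubly linked list are O(1), and you can enumerate all nodes without touching unused positions, so enumerating all nodes is O(n), where n is the number of nodes, not the "virtual" size of the sparse matrix.

With the doubly linked list, you enumerate the items in the order they were inserted. That order is unrelated to the X and Y coordinates.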

using System.Collections;
using System.Collections.Generic;

public class Node
{
    internal NodeTable pContainer;
    internal Node pTableNext;
    internal int pX;
    internal int pY;
    internal Node pLinkedListPrev;
    internal Node pLinkedListNext;
}

public class NodeTable :
    IEnumerable<Node>
{
    private Node[] pTable;
    private Node pLinkedListFirst;
    private Node pLinkedListLast;

    // Capacity should be a prime number, large enough for the number of items you want to store.
    // This could be made dynamic too, but that needs more work (rehashing and prime number computation).
    public NodeTable(int capacity)
    {
        this.pTable = new Node[capacity];
    }

    public int GetHashCode(int x, int y)
    {
        // The multiplier must be a prime number; unchecked lets the product wrap instead of throwing.
        return unchecked(x + y * 104729);
    }

    public Node Get(int x, int y)
    {
        int bucket = (GetHashCode(x, y) & 0x7FFFFFFF) % this.pTable.Length;
        for (Node current = this.pTable[bucket]; current != null; current = current.pTableNext)
        {
            if (current.pX == x && current.pY == y)
                return current;
        }
        return null;
    }

    public IEnumerator<Node> GetEnumerator()
    {
        // Replace yield with a custom struct enumerator to optimize performance.
        for (Node node = this.pLinkedListFirst, next; node != null; node = next)
        {
            next = node.pLinkedListNext;
            yield return node;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }

    public bool Set(int x, int y, Node node)
    {
        // Accept null (a removal request) or a node that is not yet stored in any container.
        if (node == null || node.pContainer == null)
        {
            int bucket = (GetHashCode(x, y) & 0x7FFFFFFF) % this.pTable.Length;

            for (Node current = this.pTable[bucket], prev = null; current != null; current = current.pTableNext)
            {
                if (current.pX == x && current.pY == y)
                {
                    this.fRemoveFromLinkedList(current);
                    current.pContainer = null; // The old node no longer belongs to this table.

                    if (node == null)
                    {
                        // Remove from table linked list

                        if (prev != null)
                            prev.pTableNext = current.pTableNext;
                        else
                            this.pTable[bucket] = current.pTableNext;
                        current.pTableNext = null;
                    }
                    else
                    {
                        // Replace old node from table linked list

                        node.pTableNext = current.pTableNext;
                        current.pTableNext = null;

                        if (prev != null)
                            prev.pTableNext = node;
                        else
                            this.pTable[bucket] = node;

                        node.pContainer = this;
                        node.pX = x;
                        node.pY = y;

                        this.fAddToLinkedList(node);
                    }

                    return true;
                }
                prev = current;
            }

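            // Nothing is stored at (x, y), so a removal request has nothing to do.

            if (node == null)
                return false;

            // New node.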

            node.pContainer = this;
            node.pX = x;
            node.pY = y;

            // Add to table linked list

            node.pTableNext = this.pTable[bucket];
            this.pTable[bucket] = node;

            // Add to global linked list

            this.fAddToLinkedList(node);

            return true;
        }
        return false;
    }

    private void fRemoveFromLinkedList(Node node)
    {
        Node prev = node.pLinkedListPrev;
        Node next = node.pLinkedListNext;

        if (prev != null)
            prev.pLinkedListNext = next;
        else
            this.pLinkedListFirst = next;

        if (next != null)
            next.pLinkedListPrev = prev;
        else
            this.pLinkedListLast = prev;

        node.pLinkedListPrev = null;
        node.pLinkedListNext = null;
    }

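    private void fAddToLinkedList(Node node)
    {
        // Append to the tail and link the previous tail forward to the new node.
        node.pLinkedListPrev = this.pLinkedListLast;
        node.pLinkedListNext = null;
        if (this.pLinkedListLast != null)
            this.pLinkedListLast.pLinkedListNext = node;
        else
            this.pLinkedListFirst = node;
        this.pLinkedListLast = node;
    }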
}
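The constructor comment above asks for a prime capacity and mentions that computing one is extra work. A minimal sketch of such a helper, using plain trial division, might look like this. PrimeHelper and NextPrime are hypothetical names, not part of the answer's code.

```csharp
using System;

public static class PrimeHelper
{
    // Trial division by odd numbers; fine for a one-off capacity calculation.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long d = 3; d * d <= n; d += 2)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Returns the smallest prime that is greater than or equal to min.
    public static int NextPrime(int min)
    {
        if (min <= 2) return 2;
        int candidate = (min % 2 == 0) ? min + 1 : min;
        while (!IsPrime(candidate))
            candidate += 2;
        return candidate;
    }
}
```

The table could then be created with something like new NodeTable(PrimeHelper.NextPrime(expectedItemCount)), where expectedItemCount is whatever estimate you have for the number of occupied cells.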

Answer 4

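Arrays provide several functions:

A way to organize data as a list of elements
A way to access the data elements by index number (1st, 2nd, 3rd, etc.)

But a common downside (depending on the language and runtime) is that arrays often work poorly as sparse data structures: if you don't need all of the array elements, you end up wasting memory.

So, yes, a hash table will generally save space compared to an array.

But you asked: "My main aim is to reduce the number of iterations only by changing the data structure." To answer that, we need to know more about your algorithm, namely what you are doing in each loop of your program.

For instance, there are many ways to sort an array or a matrix, and different sorting algorithms use different numbers of iterations.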
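To make the point about iterations concrete, here is a small sketch that assumes the algorithm does independent work per occupied cell (summing values stands in for that work). IterationDemo and its methods are invented for this illustration. If the real algorithm compares cells with each other, the saving will depend on the algorithm rather than on the container.

```csharp
using System;
using System.Collections.Generic;

public static class IterationDemo
{
    // Dense version: visits every position, occupied or not.
    public static long SumDense(int?[,] grid, out long visited)
    {
        long sum = 0;
        visited = 0;
        for (int x = 0; x < grid.GetLength(0); x++)
        {
            for (int y = 0; y < grid.GetLength(1); y++)
            {
                visited++;
                if (grid[x, y].HasValue)
                    sum += grid[x, y].Value;
            }
        }
        return sum;
    }

    // Sparse version: visits only the entries that exist.
    public static long SumSparse(Dictionary<(int X, int Y), int> cells, out long visited)
    {
        long sum = 0;
        visited = 0;
        foreach (KeyValuePair<(int X, int Y), int> pair in cells)
        {
            visited++;
            sum += pair.Value;
        }
        return sum;
    }
}
```

Both methods return the same sum, but the dense one counts width * height visits while the sparse one counts one visit per stored entry.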

Replacing a two-dimensional array with another data structure
