Question

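I'm working on a project, written in Java, which requires that I build a very large 2-D sparse array. Very sparse, if that makes a difference. Anyway: the most crucial aspect of this application is efficiency in terms of time (assume loads of memory, though not nearly enough to let me use a standard 2-D array; the key range is in the billions in both dimensions).

Out of the kajillion cells in the array, several hundred thousand will contain an object. I need to be able to modify cell contents very quickly.

Anyway: does anyone know a particularly good library for this purpose? It would have to be under a Berkeley, LGPL or similar license (no GPL, as the product can't be entirely open-sourced). Or, if there's just a very simple way to make a homebrew sparse array object, that would be fine too.

I'm considering MTJ, but haven't heard any opinions on its quality.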

Answer 1

The following framework for benchmarking Java matrix libraries also provides a good list of them: https://lessthanoptimal.github.io/Java-Matrix-Benchmark/

Tested libraries:

* Colt
* Commons Math
* Efficient Java Matrix Library (EJML)
* Jama
* jblas
* JScience (Older benchmarks only)
* Matrix Toolkit Java (MTJ)
* OjAlgo
* Parallel Colt
* Universal Java Matrix Package (UJMP)

Answer 2

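This seems simple.

You could store the data in a binary tree, using row * maxcolumns + column as the index.

To find an item, you simply calculate row * maxcolumns + column and binary-search the tree for it. If it's not there, you can return null. The lookup is O(log n), where n is the number of cells that contain an object.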
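
A minimal sketch of this idea, assuming `java.util.TreeMap` (a red-black tree) as the binary tree. The class name `LinearIndexGrid` and its methods are made up for illustration. Because both dimensions are in the billions, row * maxcolumns + column can approach the limit of a `long`, so the sketch uses `Math.multiplyExact` and `Math.addExact` to fail loudly rather than wrap around:

```java
import java.util.TreeMap;

// Hypothetical helper; the name and API are illustrative, not from any library.
class LinearIndexGrid<V> {
    private final long maxColumns;
    // TreeMap is a balanced binary tree: get/put/remove are O(log n)
    // in the number of occupied cells.
    private final TreeMap<Long, V> cells = new TreeMap<>();

    LinearIndexGrid(long maxColumns) {
        if (maxColumns <= 0) {
            throw new IllegalArgumentException("maxColumns must be positive");
        }
        this.maxColumns = maxColumns;
    }

    // Maps (row, column) to the single key row * maxColumns + column.
    private long key(long row, long column) {
        if (row < 0 || column < 0 || column >= maxColumns) {
            throw new IndexOutOfBoundsException("row=" + row + ", column=" + column);
        }
        // The *Exact methods throw ArithmeticException on overflow.
        return Math.addExact(Math.multiplyExact(row, maxColumns), column);
    }

    V get(long row, long column) {
        return cells.get(key(row, column)); // null if the cell is empty
    }

    void put(long row, long column, V value) {
        cells.put(key(row, column), value);
    }

    V remove(long row, long column) {
        return cells.remove(key(row, column));
    }

    int size() {
        return cells.size();
    }

    public static void main(String[] args) {
        LinearIndexGrid<String> grid = new LinearIndexGrid<>(3_000_000_000L);
        grid.put(2_000_000_000L, 1_500_000_000L, "far away");
        System.out.println(grid.get(2_000_000_000L, 1_500_000_000L)); // prints "far away"
        System.out.println(grid.get(0, 0));                           // prints "null"
    }
}
```

If the product of the two dimensions does not fit in a `long`, a single numeric key no longer works and a two-field key object is the safer choice.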

Answer 3

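Maybe not the fastest runtime solution, but the fastest I could come up with that seems to work. Create an Index class and use it as the key of a SortedMap, like this: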

    SortedMap<Index, Object> entries = new TreeMap<Index, Object>();
    entries.put(new Index(1, 4), "1-4");
    entries.put(new Index(5555555555l, 767777777777l), "5555555555l-767777777777l");
    System.out.println(entries.size());
    System.out.println(entries.get(new Index(1, 4)));
    System.out.println(entries.get(new Index(5555555555l, 767777777777l)));

My Index class looks like this (with some help from the Eclipse code generator).

public static class Index implements Comparable<Index>
{
    private long x;
    private long y;

    public Index(long x, long y)
    {
        super();
        this.x = x;
        this.y = y;
    }

    // Orders by x first, then by y, both ascending; consistent with equals().
    public int compareTo(Index index)
    {
        int byX = Long.compare(x, index.x);
        if (byX != 0)
        {
            return byX;
        }
        return Long.compare(y, index.y);
    }

    public int hashCode()
    {
        final int PRIME = 31;
        int result = 1;
        result = PRIME * result + (int) (x ^ (x >>> 32));
        result = PRIME * result + (int) (y ^ (y >>> 32));
        return result;
    }

    public boolean equals(Object obj)
    {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        final Index other = (Index) obj;
        if (x != other.x)
            return false;
        if (y != other.y)
            return false;
        return true;
    }

    public long getX()
    {
        return x;
    }

    public long getY()
    {
        return y;
    }
}

Answer 4

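You can take a look at the la4j (Linear Algebra for Java) library. It supports both CRS (Compressed Row Storage) and CCS (Compressed Column Storage) internal representations for sparse matrices. These are among the most efficient and fastest internal structures for sparse data.

Here is a brief example of using sparse matrices in la4j: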

Matrix a = new CRSMatrix(new double[][]{ // 'a' - CRS sparse matrix
   { 1.0, 0.0, 3.0 },
   { 0.0, 5.0, 0.0 },
   { 7.0, 0.0, 9.0 }
});

Matrix b = a.transpose(); // 'b' - CRS sparse matrix

Matrix c = b.multiply(a, Matrices.CCS_FACTORY); // 'c' = 'b' * 'a'; 
                                                // 'c' - CCS sparse matrix
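
To make the CRS layout concrete, here is a hand-rolled, read-only sketch in plain Java. It is not la4j's implementation; the class `CrsSketch` and its field names are invented for illustration. It stores the non-zero values row by row, the column of each value, and a pointer array marking where each row starts:

```java
import java.util.Arrays;

// Hypothetical illustration of Compressed Row Storage; not la4j code.
class CrsSketch {
    private final double[] values;   // non-zero values, stored row by row
    private final int[] columnIndex; // column of each entry in values
    private final int[] rowPointer;  // row r occupies [rowPointer[r], rowPointer[r + 1])

    CrsSketch(double[][] dense) {
        int rows = dense.length;
        int nonZeros = 0;
        for (double[] row : dense) {
            for (double v : row) {
                if (v != 0.0) nonZeros++;
            }
        }
        values = new double[nonZeros];
        columnIndex = new int[nonZeros];
        rowPointer = new int[rows + 1];
        int k = 0;
        for (int r = 0; r < rows; r++) {
            rowPointer[r] = k;
            for (int c = 0; c < dense[r].length; c++) {
                if (dense[r][c] != 0.0) {
                    values[k] = dense[r][c];
                    columnIndex[k] = c;
                    k++;
                }
            }
        }
        rowPointer[rows] = k;
    }

    double get(int row, int column) {
        // Columns within a row are stored in ascending order,
        // so a binary search over that row's slice finds the entry.
        int pos = Arrays.binarySearch(columnIndex, rowPointer[row], rowPointer[row + 1], column);
        return pos >= 0 ? values[pos] : 0.0;
    }

    public static void main(String[] args) {
        CrsSketch a = new CrsSketch(new double[][] {
            { 1.0, 0.0, 3.0 },
            { 0.0, 5.0, 0.0 },
            { 7.0, 0.0, 9.0 }
        });
        System.out.println(a.get(2, 2)); // prints 9.0
        System.out.println(a.get(1, 0)); // prints 0.0
    }
}
```

Note that inserting a new non-zero entry into this layout means shifting the arrays, so compressed formats tend to suit read-heavy linear algebra better than workloads that modify individual cells frequently.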

Answer 5

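You could just use a nested map, although if you need to do matrix calculations on it, that might not be the best option.

 Map<Integer, Map<Integer, Object>> matrix;

Maybe, instead of Object, use some kind of tuple for the actual data so that it is easier to work with after extraction, something like: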

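import java.util.HashMap;
import java.util.Map;

// YourDataObject is a placeholder for your own data type.
class Tuple<T extends YourDataObject> {
  public final int x;
  public final int y;
  public final T object;

  Tuple(int x, int y, T object) {
    this.x = x;
    this.y = y;
    this.object = object;
  }
}

class Matrix<T extends YourDataObject> {
  // Outer key: x (row); inner key: y (column).
  private final Map<Integer, Map<Integer, Tuple<T>>> data = new HashMap<>();

  void add(int x, int y, T object) {
    // Create the inner map for row x on first use, then store the cell under y.
    data.computeIfAbsent(x, row -> new HashMap<>()).put(y, new Tuple<>(x, y, object));
  }
}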


//etc

Null checks etc. omitted for brevity.

Answer 6

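HashMaps rock. Just concatenate the indexes (as strings) with a separator, say '/', using StringBuilder (not + or String.format), and use the result as the key. You can't get faster and more memory-efficient than that. Sparse matrices are soo 20th century. :-)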
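
A minimal sketch of the string-key approach described above, assuming a plain `java.util.HashMap`. The class name `StringKeyGrid` and its methods are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the name and API are illustrative only.
class StringKeyGrid<V> {
    private final Map<String, V> cells = new HashMap<>();

    // Builds a key such as "5555555555/767777777777" with StringBuilder.
    private static String key(long row, long column) {
        return new StringBuilder().append(row).append('/').append(column).toString();
    }

    V get(long row, long column) {
        return cells.get(key(row, column)); // null if the cell is empty
    }

    void put(long row, long column, V value) {
        cells.put(key(row, column), value);
    }

    V remove(long row, long column) {
        return cells.remove(key(row, column));
    }

    int size() {
        return cells.size();
    }

    public static void main(String[] args) {
        StringKeyGrid<String> grid = new StringKeyGrid<>();
        grid.put(5555555555L, 767777777777L, "hello");
        System.out.println(grid.get(5555555555L, 767777777777L)); // prints "hello"
        System.out.println(grid.size());                          // prints 1
    }
}
```

Every lookup here builds a fresh String, so whether this beats a key object with two long fields is worth measuring for the actual workload rather than assuming.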

Sparse matrices / arrays in Java
