Question

I have a class of objects like this:

public class Individual
{
    public double[] Number { get; set; } = new double[2]{ 0.0, 0.0 };
}

I store instances of this class in a list of dictionaries and assign values to Individual.Number:

selection = List<Dictionary<int, Individual>>

Now I have to count the number of distinct values of Individual.Number (across the whole list). What I have done so far is:

selection.Values.SelectMany(list => list.Number).Distinct().Count();

Is this the fastest way to do it? How can I improve the performance?

Thanks

Answer 1

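Internally, the Distinct() method creates a new Set<T> without specifying a size.

If you have a rough idea of the number of elements, creating the set with an initial capacity can avoid a number of allocations (and memory copies) as it grows.

Since you only need the Count(), you can fold it into the same method (credits to @TimSchmelter).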

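    using System;
    using System.Collections.Generic;
    public static class EnumerableExtensions
    {
        // Counts distinct elements using a HashSet pre-sized to the expected element count.
        // The HashSet<T>(int capacity) constructor requires .NET Framework 4.7.2+ or .NET Core 2.0+.
        public static int OptimizedDistinctAndCount<TSource>(this IEnumerable<TSource> source, int numberOfElements) {
            if (source == null) throw new ArgumentNullException(nameof(source));
            var set = new HashSet<TSource>(numberOfElements);
            foreach (TSource element in source) {
               set.Add(element); // Add ignores values that are already in the set
            }
            return set.Count;
        }
    }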

Then you can use:

selection.Values.SelectMany(list => list.Number).OptimizedDistinctAndCount(123);
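
Note that selection in the question is a List<Dictionary<int, Individual>>, which has no Values property, so the dictionaries need to be flattened first. Below is a minimal sketch of one way to do that, assuming the Individual class from the question and the extension method above. The names DistinctCountExtensions, Demo and CountDistinctNumbers, the sample data, and the capacity of 16 are illustrative assumptions, not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Individual
{
    public double[] Number { get; set; } = new double[2] { 0.0, 0.0 };
}

public static class DistinctCountExtensions
{
    // Same idea as the method above: count distinct items with a pre-sized HashSet.
    public static int OptimizedDistinctAndCount<TSource>(this IEnumerable<TSource> source, int numberOfElements)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        var set = new HashSet<TSource>(numberOfElements);
        foreach (TSource element in source)
        {
            set.Add(element);
        }
        return set.Count;
    }
}

public static class Demo
{
    // Hypothetical helper: flattens list -> dictionaries -> individuals -> numbers.
    public static int CountDistinctNumbers(List<Dictionary<int, Individual>> selection, int expectedCount)
    {
        return selection
            .SelectMany(dict => dict.Values)              // every Individual in every dictionary
            .SelectMany(individual => individual.Number)  // every double in every Individual
            .OptimizedDistinctAndCount(expectedCount);
    }

    public static void Main()
    {
        // Illustrative sample data (an assumption, not from the question).
        var selection = new List<Dictionary<int, Individual>>
        {
            new Dictionary<int, Individual>
            {
                { 1, new Individual { Number = new[] { 1.0, 2.0 } } },
                { 2, new Individual { Number = new[] { 2.0, 3.0 } } },
            },
            new Dictionary<int, Individual>
            {
                { 3, new Individual { Number = new[] { 3.0, 4.0 } } },
            },
        };

        // The distinct values are 1, 2, 3 and 4.
        Console.WriteLine(CountDistinctNumbers(selection, 16)); // prints 4
    }
}
```

The capacity argument is only a hint: a value that is too small still gives the correct count, the set just has to grow along the way.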

Answer 2

What do you think about this?

using System;
using System.Collections.Generic;

public class Individual
{
  public double[] Numbers { get; set; }
  public Individual()
  {
    Numbers = new double[0];
  }
  public Individual(double[] values)
  {
    Numbers = values/*.ToArray() if a copy must be done*/;
  }
}

class Program
{
  static void Main()
  {
    // Populate data
    var selection = new List<Dictionary<int, Individual>>();
    var dico1 = new Dictionary<int, Individual>();
    var dico2 = new Dictionary<int, Individual>();
    selection.Add(dico1);
    selection.Add(dico2);
    dico1.Add(1, new Individual(new double[] { 1.2, 1.3, 4.0, 10, 40 }));
    dico1.Add(2, new Individual(new double[] { 1.2, 1.5, 4.0, 20, 40 }));
    dico2.Add(3, new Individual(new double[] { 1.7, 1.6, 5.0, 30, 60 }));
    // Count distinct
    var found = new List<double>();
    foreach ( var dico in selection )
      foreach ( var item in dico )
        foreach ( var value in item.Value.Numbers )
          if ( !found.Contains(value) )
            found.Add(value);
    // Must show 12
    Console.WriteLine("Distinct values of the data pool = " + found.Count);
    Console.ReadKey();
  }
}

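This approach eliminates some method-call overhead.

A further optimization would be to use for loops instead of foreach, and perhaps a linked list instead of a List (faster, but it requires more memory).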
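
One caveat about the loop above: List<T>.Contains performs a linear search, so each membership check gets slower as more distinct values are found. As a hedged alternative sketch (not part of the original answer), the same nested loops can collect the values in a HashSet<double>, whose Add method ignores duplicates. The names DistinctCounter and CountDistinct are illustrative assumptions; the sample data is the one used in the answer.

```csharp
using System;
using System.Collections.Generic;

public class Individual
{
    public double[] Numbers { get; set; } = new double[0];
}

public static class DistinctCounter
{
    // Hypothetical helper: same traversal as the answer, but with a hash set
    // instead of a List plus Contains.
    public static int CountDistinct(List<Dictionary<int, Individual>> selection)
    {
        var found = new HashSet<double>();
        foreach (var dico in selection)
            foreach (var item in dico)
                foreach (var value in item.Value.Numbers)
                    found.Add(value); // no effect if the value is already present
        return found.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same data as in the answer above.
        var dico1 = new Dictionary<int, Individual>
        {
            { 1, new Individual { Numbers = new double[] { 1.2, 1.3, 4.0, 10, 40 } } },
            { 2, new Individual { Numbers = new double[] { 1.2, 1.5, 4.0, 20, 40 } } },
        };
        var dico2 = new Dictionary<int, Individual>
        {
            { 3, new Individual { Numbers = new double[] { 1.7, 1.6, 5.0, 30, 60 } } },
        };
        var selection = new List<Dictionary<int, Individual>> { dico1, dico2 };

        Console.WriteLine("Distinct values of the data pool = " + DistinctCounter.CountDistinct(selection)); // 12
    }
}
```

Both versions compare doubles for exact equality, so values that differ only by rounding error are counted separately. Whether the set is actually faster for a given data size is best confirmed by measuring.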

Fastest way to count the number of distinct elements in arrays
