Question

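I have a C# application that stores data from a text file in a Dictionary object. The amount of data to be stored can be rather large, so inserting the entries takes a lot of time. With many items in the Dictionary it gets even worse, because the internal array that stores the Dictionary's data has to be resized. So I initialized the Dictionary with the number of items that will be added, but this has no impact on speed.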

Here is my function:

private Dictionary<IdPair, Edge> AddEdgesToExistingNodes(HashSet<NodeConnection> connections)
{
  Dictionary<IdPair, Edge> resultSet = new Dictionary<IdPair, Edge>(connections.Count);

  foreach (NodeConnection con in connections)
  {
    ...
    resultSet.Add(nodeIdPair, newEdge);
  }

  return resultSet;
}

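In my tests, I insert ~300k items. I checked the running time with ANTS Performance Profiler and found that the average time for resultSet.Add(...) doesn't change when I initialize the Dictionary with the needed size. It is the same as when I initialize the Dictionary with new Dictionary() (about 0.256 ms on average for each Add). This is definitely caused by the amount of data in the Dictionary (even though I initialized it with the needed size). For the first 20k items, the average time for Add is 0.03 ms per item.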

Any ideas on how to make the Add operation faster?

Thanks in advance, Frank

Here is my IdPair struct:

public struct IdPair
{
  public int id1;
  public int id2;

  public IdPair(int oneId, int anotherId)
  {
    if (oneId > anotherId)
    {
      id1 = anotherId;
      id2 = oneId;
    }
    else if (anotherId > oneId)
    {
      id1 = oneId;
      id2 = anotherId;
    }
    else
      throw new ArgumentException("The two Ids of the IdPair can't have the same value.");
  }
}

Answer 1

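Since you have a struct, you get the default implementations of Equals() and GetHashCode(). As others have pointed out, this is not very efficient because it uses reflection, but I don't think reflection is the issue.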

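My guess is that your hash codes get distributed unevenly by the default GetHashCode(), which could happen, for example, if the default implementation returns a simple XOR of all members (in which case hash(a, b) == hash(b, a)). I can't find any documentation on how ValueType.GetHashCode() is implemented, but try adding

public override int GetHashCode() {
    // Combine the struct's fields id1 and id2 (the constructor parameters are not in scope here).
    return (id1 << 16) | (id2 & 0xffff);
}

which might work better.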
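
To make the guess above concrete, here is a minimal sketch that counts how many distinct hash codes each scheme produces for a set of ordered id pairs. XorHash is only a hypothetical stand-in for what the default implementation might do; it is not taken from the ValueType.GetHashCode() source. The class name, method names, and the sample size n are likewise illustrative.

```csharp
using System;
using System.Collections.Generic;

public static class HashSpreadDemo
{
    // Hypothetical XOR-style hash; an assumption, not the documented default behavior.
    public static int XorHash(int id1, int id2) => id1 ^ id2;

    // Shift-and-combine hash as suggested in this answer.
    public static int ShiftHash(int id1, int id2) => (id1 << 16) | (id2 & 0xffff);

    // Counts distinct hash codes over all pairs with 0 <= id1 < id2 < n.
    public static int CountDistinct(Func<int, int, int> hash, int n)
    {
        var seen = new HashSet<int>();
        for (int id1 = 0; id1 < n; id1++)
            for (int id2 = id1 + 1; id2 < n; id2++)
                seen.Add(hash(id1, id2));
        return seen.Count;
    }

    public static void Main()
    {
        const int n = 500; // 500 * 499 / 2 = 124,750 pairs
        Console.WriteLine($"XOR:   {CountDistinct(XorHash, n)} distinct codes");
        Console.WriteLine($"Shift: {CountDistinct(ShiftHash, n)} distinct codes");
    }
}
```

With ids below 512, a XOR of two ids can never exceed 511, so over a hundred thousand keys would share a few hundred hash codes, while the shifted variant keeps every pair in this range distinct. Note that the shifted variant only uses the low 16 bits of id2, so it would start to collide once ids exceed 65535.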

Answer 2

IdPair is a struct, and you haven't overridden Equals or GetHashCode. This means that the default implementations of those methods will be used.

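For value types, the default implementations of Equals and GetHashCode use reflection, which is likely to result in poor performance. Try providing your own implementations of these methods and see if that helps.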

My suggested implementation, which might not be exactly what you need/want:

public struct IdPair : IEquatable<IdPair>
{
    // ...

    public override bool Equals(object obj)
    {
        if (obj is IdPair)
            return Equals((IdPair)obj);

        return false;
    }

    public bool Equals(IdPair other)
    {
        return id1.Equals(other.id1)
            && id2.Equals(other.id2);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 269;
            hash = (hash * 19) + id1.GetHashCode();
            hash = (hash * 19) + id2.GetHashCode();
            return hash;
        }
    }
}
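
To check whether the overrides actually help, something like the following rough timing sketch could be used. It assumes a placeholder int value instead of the question's Edge type, generates synthetic keys, and uses Stopwatch rather than a profiler; InsertTimingDemo and Build are hypothetical names, and the numbers it prints will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public struct IdPair : IEquatable<IdPair>
{
    public int id1;
    public int id2;

    public IdPair(int oneId, int anotherId)
    {
        if (oneId == anotherId)
            throw new ArgumentException("The two Ids of the IdPair can't have the same value.");
        // Store the smaller id first, as in the question's constructor.
        id1 = Math.Min(oneId, anotherId);
        id2 = Math.Max(oneId, anotherId);
    }

    public bool Equals(IdPair other) => id1 == other.id1 && id2 == other.id2;

    public override bool Equals(object obj) => obj is IdPair other && Equals(other);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 269;
            hash = (hash * 19) + id1;
            hash = (hash * 19) + id2;
            return hash;
        }
    }
}

public static class InsertTimingDemo
{
    // Fills a pre-sized dictionary with 'count' synthetic, distinct keys.
    public static Dictionary<IdPair, int> Build(int count)
    {
        var result = new Dictionary<IdPair, int>(count);
        for (int i = 0; i < count; i++)
            result.Add(new IdPair(i, i + 1 + i % 7), i); // id1 == i, so keys never repeat
        return result;
    }

    public static void Main()
    {
        var watch = Stopwatch.StartNew();
        var dict = Build(300000);
        watch.Stop();
        Console.WriteLine($"{dict.Count} items in {watch.ElapsedMilliseconds} ms");
    }
}
```

Removing the three overrides (and the IEquatable<IdPair> interface) and running the same loop gives a baseline to compare against.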

High runtime for Dictionary.Add for a large number of items
