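I have a List<CustomPoint> points containing close to a million objects.
From this list I want to get the list of objects that occur exactly twice. What is the fastest way to do that? I would also be interested in a non-LINQ option, since I may have to do this in C++ as well.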
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }
    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }
}
public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint x, CustomPoint y)
    {
        return ((x.X == y.X) && (y.Y == x.Y));
    }
    public int GetHashCode(CustomPoint obj)
    {
        int hash = 0;
        hash ^= obj.X.GetHashCode();
        hash ^= obj.Y.GetHashCode();
        return hash;
    }
}
Based on this answer, I tried:
list.GroupBy(x => x).Where(x => x.Count() == 2).Select(x => x.Key).ToList();
But this gives me zero objects in the new list. Can someone guide me?
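Answer 0 (score: 9)
You should implement Equals and GetHashCode on the class itself rather than in PointComparer. GroupBy(x => x) uses the default equality comparer, which for a class with no overrides compares references, so every point ends up in its own group.
Answer 1 (score: 4)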
To make your code work, you need to pass an instance of PointComparer as the second argument to GroupBy.
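A minimal sketch of that fix, assuming the CustomPoint and PointComparer classes from the question (repeated here so the snippet compiles on its own). The PointQueries class and the OccurringExactlyTwice method name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }
    public CustomPoint(double x, double y) { X = x; Y = y; }
}

// Same comparer as in the question: value equality on X and Y.
public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint a, CustomPoint b)
    {
        return a.X == b.X && a.Y == b.Y;
    }

    public int GetHashCode(CustomPoint obj)
    {
        return obj.X.GetHashCode() ^ obj.Y.GetHashCode();
    }
}

// Hypothetical helper class, not part of the original question.
public static class PointQueries
{
    public static List<CustomPoint> OccurringExactlyTwice(IEnumerable<CustomPoint> points)
    {
        return points
            // The second argument tells GroupBy how to compare the keys.
            .GroupBy(p => p, new PointComparer())
            .Where(g => g.Count() == 2)
            .Select(g => g.Key)
            .ToList();
    }
}
```

For a list holding (1, 3) twice, (0, 2) once and (5, 5) three times, PointQueries.OccurringExactlyTwice(list) should return only the point (1, 3).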
Answer 2 (score: 3)
This approach worked for me:
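public class PointCount
{
    public CustomPoint Point { get; set; }
    public int Count { get; set; }
}
// Returns the points whose occurrence count equals the requested count.
private static IEnumerable<CustomPoint> GetPointsByCount(Dictionary<CustomPoint, PointCount> pointcount, int count)
{
    return pointcount
        .Where(p => p.Value.Count == count)
        .Select(p => p.Value.Point);
}
// Counts occurrences in a single pass over the list. The dictionary is keyed by the
// point itself (via CustomPoint.Equals/GetHashCode) rather than by the raw hash code,
// so two different points that happen to share a hash code are not merged.
private static Dictionary<CustomPoint, PointCount> GetPointCount(List<CustomPoint> pointList)
{
    var allPoints = new Dictionary<CustomPoint, PointCount>();
    foreach (var point in pointList)
    {
        PointCount entry;
        if (allPoints.TryGetValue(point, out entry))
        {
            entry.Count++;
        }
        else
        {
            allPoints.Add(point, new PointCount { Point = point, Count = 1 });
        }
    }
    return allPoints;
}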
Call it like this:
static void Main(string[] args)
{
    List<CustomPoint> list1 = CreateCustomPointList();
    var doubles = GetPointsByCount(GetPointCount(list1), 2);
    Console.WriteLine("Doubles:");
    foreach (var point in doubles)
    {
        Console.WriteLine("X: {0}, Y: {1}", point.X, point.Y);
    }
}
private static List<CustomPoint> CreateCustomPointList()
{
    var result = new List<CustomPoint>();
    for (int i = 0; i < 5; i++)
    {
        for (int j = 0; j < 5; j++)
        {
            result.Add(new CustomPoint(i, j));
        }
    }
    result.Add(new CustomPoint(1, 3));
    result.Add(new CustomPoint(3, 3));
    result.Add(new CustomPoint(0, 2));
    return result;
}
CustomPoint implementation:
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }
    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }
    public override bool Equals(object obj)
    {
        var other = obj as CustomPoint;
        if (other == null)
        {
            // null or a different type: never equal to a CustomPoint
            return false;
        }
        return ((this.X == other.X) && (this.Y == other.Y));
    }
    public override int GetHashCode()
    {
        // unchecked: let the arithmetic wrap even if overflow checking is enabled
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + this.X.GetHashCode();
            hash = hash * 31 + this.Y.GetHashCode();
            return hash;
        }
    }
}
This prints:
Doubles:
X: 0, Y: 2
X: 1, Y: 3
X: 3, Y: 3
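As you can see in GetPointCount(), I build a dictionary with one entry per unique CustomPoint (found via its hash). Each entry is a PointCount object that holds a reference to the CustomPoint and a Count that starts at 1; every time the same point is encountered again, Count is incremented.
Finally, GetPointsByCount returns the CustomPoints in the dictionary whose PointCount.Count == count, which is 2 in your case.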
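Note that I also changed the GetHashCode() method, because yours returns the same value for the points (1,2) and (2,1). Feel free to restore your own hash method if that is really what you want. You will have to test your hash function, though, because it is hard to hash two numbers uniquely into one. How well it works depends on the range of numbers used, so you should implement a hash function that fits your own needs.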
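To make the (1,2) versus (2,1) point concrete, here is a small sketch that puts the two hash combinations side by side. The PointHashes class and its method names are invented for this illustration, and the third variant assumes System.HashCode is available (.NET Core 2.1 or later); that type is not mentioned anywhere in the question.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class PointHashes
{
    // The question's combination: XOR is commutative, so swapping x and y
    // always yields the same hash.
    public static int XorHash(double x, double y)
    {
        return x.GetHashCode() ^ y.GetHashCode();
    }

    // The multiply-and-add combination from this answer: x and y are weighted
    // differently, so swapped coordinates usually hash differently.
    public static int MultiplyAddHash(double x, double y)
    {
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + y.GetHashCode();
            return hash;
        }
    }

    // Assumption: System.HashCode exists in the target framework.
    // Its values can differ between program runs, so never persist them.
    public static int CombinedHash(double x, double y)
    {
        return HashCode.Combine(x, y);
    }
}
```

PointHashes.XorHash(1, 2) == PointHashes.XorHash(2, 1) holds by construction, while the multiply-and-add version separates those two points. None of the variants rules out collisions: depending on how Double.GetHashCode is implemented, MultiplyAddHash(0, 0) and MultiplyAddHash(2, 2) may even coincide. The hash therefore only needs to spread your data well, and Equals has to make the final decision.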