Distinct()与lambda?

时间:2009-08-19 13:50:48

标签: c# c#-3.0 lambda extension-methods

是的,所以我有一个可枚举的,并希望从中得到不同的值。

使用System.Linq,当然有一种名为Distinct的扩展方法。在简单的情况下,它可以不带参数使用,例如:

var distinctValues = myStringList.Distinct();

嗯,好,但是如果我有一个我需要指定相等性的可枚举对象,唯一可用的重载是:

var distinctValues = myCustomerList.Distinct(someEqualityComparer);

相等比较器参数必须是IEqualityComparer<T>的实例。当然,我可以做到这一点,但它有点冗长,而且很好,很明显。

我所期望的是一个需要lambda的重载,比如一个Func&lt; T,T,bool&gt;:

var distinctValues
    = myCustomerList.Distinct((c1, c2) => c1.CustomerId == c2.CustomerId);

任何人都知道是否存在某些此类扩展或某些等效的解决方法?或者我错过了什么?

或者,是否有一种方法可以指定内联的IEqualityComparer(embarass me)?

更新

我在一个MSDN论坛上找到了Anders Hejlsberg对post的回复。他说:

  

你要遇到的问题是当两个对象比较时   相等,他们必须具有相同的GetHashCode返回值(否则   Distinct内部使用的哈希表将无法正常运行)。   我们使用IEqualityComparer,因为它兼容包   将Equals和GetHashCode实现为单个接口。

我认为这是有道理的。

19 个答案:

答案 0 :(得分:951)

IEnumerable<Customer> filteredList = originalList
  .GroupBy(customer => customer.CustomerId)
  .Select(group => group.First());

答案 1 :(得分:457)

在我看来,您希望来自DistinctByMoreLINQ。然后你可以写:

var distinctValues = myCustomerList.DistinctBy(c => c.CustomerId);

这是DistinctBy的简化版本(没有无效检查,也没有指定自己的密钥比较器的选项):

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
     (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> knownKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (knownKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

答案 2 :(得分:28)

总结一下。我想像我这样来到这里的大多数人都希望最简单的解决方案而不使用任何库并且最好的性能

(对我来说,按照方法接受的群体在性能方面是一种矫枉过正。)

这是一个使用 IEqualityComparer 接口的简单扩展方法,该接口也适用于空值。

<强>用法:

var filtered = taskList.DistinctBy(t => t.TaskExternalId).ToArray();

扩展程序代码

public static class LinqExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
    {
        GeneralPropertyComparer<T, TKey> comparer = new GeneralPropertyComparer<T,TKey>(property);
        return items.Distinct(comparer);
    }   
}
public class GeneralPropertyComparer<T,TKey> : IEqualityComparer<T>
{
    private Func<T, TKey> expr { get; set; }
    public GeneralPropertyComparer (Func<T, TKey> expr)
    {
        this.expr = expr;
    }
    public bool Equals(T left, T right)
    {
        var leftProp = expr.Invoke(left);
        var rightProp = expr.Invoke(right);
        if (leftProp == null && rightProp == null)
            return true;
        else if (leftProp == null ^ rightProp == null)
            return false;
        else
            return leftProp.Equals(rightProp);
    }
    public int GetHashCode(T obj)
    {
        var prop = expr.Invoke(obj);
        return (prop==null)? 0:prop.GetHashCode();
    }
}

答案 3 :(得分:19)

没有这样的扩展方法重载。我在过去发现这令人沮丧,因此我经常写一个助手类来处理这个问题。目标是将Func<T,T,bool>转换为IEqualityComparer<T,T>

实施例

public class EqualityFactory {
  private sealed class Impl<T> : IEqualityComparer<T,T> {
    private Func<T,T,bool> m_del;
    private IEqualityComparer<T> m_comp;
    public Impl(Func<T,T,bool> del) { 
      m_del = del;
      m_comp = EqualityComparer<T>.Default;
    }
    public bool Equals(T left, T right) {
      return m_del(left, right);
    } 
    public int GetHashCode(T value) {
      return m_comp.GetHashCode(value);
    }
  }
  public static IEqualityComparer<T,T> Create<T>(Func<T,T,bool> del) {
    return new Impl<T>(del);
  }
}

这允许您编写以下内容

var distinctValues = myCustomerList
  .Distinct(EqualityFactory.Create((c1, c2) => c1.CustomerId == c2.CustomerId));

答案 4 :(得分:16)

速记解决方案

myCustomerList.GroupBy(c => c.CustomerId, (key, c) => c.FirstOrDefault());

答案 5 :(得分:12)

这会做你想要的,但我不知道性能:

var distinctValues =
    from cust in myCustomerList
    group cust by cust.CustomerId
    into gcust
    select gcust.First();

至少它并不详细。

答案 6 :(得分:10)

这是一个简单的扩展方法,可以满足我的需要...

public static class EnumerableExtensions
{
    public static IEnumerable<TKey> Distinct<T, TKey>(this IEnumerable<T> source, Func<T, TKey> selector)
    {
        return source.GroupBy(selector).Select(x => x.Key);
    }
}

很遗憾他们没有将这样一个独特的方法烘焙到框架中,但嘿嘿。

答案 7 :(得分:4)

我用过的东西对我来说效果很好。

/// <summary>
/// A class to wrap the IEqualityComparer interface into matching functions for simple implementation
/// </summary>
/// <typeparam name="T">The type of object to be compared</typeparam>
public class MyIEqualityComparer<T> : IEqualityComparer<T>
{
    /// <summary>
    /// Create a new comparer based on the given Equals and GetHashCode methods
    /// </summary>
    /// <param name="equals">The method to compute equals of two T instances</param>
    /// <param name="getHashCode">The method to compute a hashcode for a T instance</param>
    public MyIEqualityComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
    {
        if (equals == null)
            throw new ArgumentNullException("equals", "Equals parameter is required for all MyIEqualityComparer instances");
        EqualsMethod = equals;
        GetHashCodeMethod = getHashCode;
    }
    /// <summary>
    /// Gets the method used to compute equals
    /// </summary>
    public Func<T, T, bool> EqualsMethod { get; private set; }
    /// <summary>
    /// Gets the method used to compute a hash code
    /// </summary>
    public Func<T, int> GetHashCodeMethod { get; private set; }

    bool IEqualityComparer<T>.Equals(T x, T y)
    {
        return EqualsMethod(x, y);
    }

    int IEqualityComparer<T>.GetHashCode(T obj)
    {
        if (GetHashCodeMethod == null)
            return obj.GetHashCode();
        return GetHashCodeMethod(obj);
    }
}

答案 8 :(得分:3)

我在这里看到的所有解决方案都依赖于选择已经具有可比性的领域。但是,如果需要以不同的方式进行比较,this solution here似乎通常会起作用,例如:

somedoubles.Distinct(new LambdaComparer<double>((x, y) => Math.Abs(x - y) < double.Epsilon)).Count()

答案 9 :(得分:2)

您可以使用 InlineComparer

public class InlineComparer<T> : IEqualityComparer<T>
{
    //private readonly Func<T, T, bool> equalsMethod;
    //private readonly Func<T, int> getHashCodeMethod;
    public Func<T, T, bool> EqualsMethod { get; private set; }
    public Func<T, int> GetHashCodeMethod { get; private set; }

    public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
    {
        if (equals == null) throw new ArgumentNullException("equals", "Equals parameter is required for all InlineComparer instances");
        EqualsMethod = equals;
        GetHashCodeMethod = hashCode;
    }

    public bool Equals(T x, T y)
    {
        return EqualsMethod(x, y);
    }

    public int GetHashCode(T obj)
    {
        if (GetHashCodeMethod == null) return obj.GetHashCode();
        return GetHashCodeMethod(obj);
    }
}

使用示例

  var comparer = new InlineComparer<DetalleLog>((i1, i2) => i1.PeticionEV == i2.PeticionEV && i1.Etiqueta == i2.Etiqueta, i => i.PeticionEV.GetHashCode() + i.Etiqueta.GetHashCode());
  var peticionesEV = listaLogs.Distinct(comparer).ToList();
  Assert.IsNotNull(peticionesEV);
  Assert.AreNotEqual(0, peticionesEV.Count);

来源: https://stackoverflow.com/a/5969691/206730
Using IEqualityComparer for Union
Can I specify my explicit type comparator inline?

答案 10 :(得分:2)

采取另一种方式:

var distinctValues = myCustomerList.
Select(x => x._myCaustomerProperty).Distinct();

序列返回不同的元素通过属性'_myCaustomerProperty'比较它们。

答案 11 :(得分:1)

执行此操作的一个棘手方法是使用Aggregate()扩展名,使用字典作为累加器,并将键属性值作为键:

var customers = new List<Customer>();

var distincts = customers.Aggregate(new Dictionary<int, Customer>(), 
                                    (d, e) => { d[e.CustomerId] = e; return d; },
                                    d => d.Values);

GroupBy风格的解决方案正在使用ToLookup()

var distincts = customers.ToLookup(c => c.CustomerId).Select(g => g.First());

答案 12 :(得分:1)

您可以使用LambdaEqualityComparer:

var distinctValues
    = myCustomerList.Distinct(new LambdaEqualityComparer<OurType>((c1, c2) => c1.CustomerId == c2.CustomerId));


public class LambdaEqualityComparer<T> : IEqualityComparer<T>
    {
        public LambdaEqualityComparer(Func<T, T, bool> equalsFunction)
        {
            _equalsFunction = equalsFunction;
        }

        public bool Equals(T x, T y)
        {
            return _equalsFunction(x, y);
        }

        public int GetHashCode(T obj)
        {
            return obj.GetHashCode();
        }

        private readonly Func<T, T, bool> _equalsFunction;
    }

答案 13 :(得分:0)

如果Distinct()没有产生独特的结果,请尝试以下方法:

var filteredWC = tblWorkCenter.GroupBy(cc => cc.WCID_I).Select(grp => grp.First()).Select(cc => new Model.WorkCenter { WCID = cc.WCID_I }).OrderBy(cc => cc.WCID); 

ObservableCollection<Model.WorkCenter> WorkCenter = new ObservableCollection<Model.WorkCenter>(filteredWC);

答案 14 :(得分:0)

Microsoft System.Interactive package有一个Distinct版本,它带有一个键选择器lambda。这实际上与Jon Skeet的解决方案相同,但它可能有助于人们了解并查看库的其余部分。

答案 15 :(得分:0)

IEnumerable lambda扩展名:

public static class ListExtensions
{        
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, int> hashCode)
    {
        Dictionary<int, T> hashCodeDic = new Dictionary<int, T>();

        list.ToList().ForEach(t => 
            {   
                var key = hashCode(t);
                if (!hashCodeDic.ContainsKey(key))
                    hashCodeDic.Add(key, t);
            });

        return hashCodeDic.Select(kvp => kvp.Value);
    }
}

用法:

class Employee
{
    public string Name { get; set; }
    public int EmployeeID { get; set; }
}

//Add 5 employees to List
List<Employee> lst = new List<Employee>();

Employee e = new Employee { Name = "Shantanu", EmployeeID = 123456 };
lst.Add(e);
lst.Add(e);

Employee e1 = new Employee { Name = "Adam Warren", EmployeeID = 823456 };
lst.Add(e1);
//Add a space in the Name
Employee e2 = new Employee { Name = "Adam  Warren", EmployeeID = 823456 };
lst.Add(e2);
//Name is different case
Employee e3 = new Employee { Name = "adam warren", EmployeeID = 823456 };
lst.Add(e3);            

//Distinct (without IEqalityComparer<T>) - Returns 4 employees
var lstDistinct1 = lst.Distinct();

//Lambda Extension - Return 2 employees
var lstDistinct = lst.Distinct(employee => employee.EmployeeID.GetHashCode() ^ employee.Name.ToUpper().Replace(" ", "").GetHashCode()); 

答案 16 :(得分:0)

我假设您有一个IEnumerable,并且在您的示例委托中,您希望c1和c2引用此列表中的两个元素吗?

我相信你可以通过自我加入实现这一目标 var distinctResults =来自myList中的c1                       在myList上加入c2

答案 17 :(得分:0)

以下是您可以这样做的方法:

public static class Extensions
{
    public static IEnumerable<T> MyDistinct<T, V>(this IEnumerable<T> query,
                                                    Func<T, V> f, 
                                                    Func<IGrouping<V,T>,T> h=null)
    {
        if (h==null) h=(x => x.First());
        return query.GroupBy(f).Select(h);
    }
}

此方法允许您通过指定一个参数(如.MyDistinct(d => d.Name))来使用它,但它也允许您将条件指定为第二个参数,如下所示:

var myQuery = (from x in _myObject select x).MyDistinct(d => d.Name,
        x => x.FirstOrDefault(y=>y.Name.Contains("1") || y.Name.Contains("2"))
        );

N.B。这也允许您指定其他功能,例如.LastOrDefault(...)

如果你想公开这个条件,你可以通过实现它来简化它:

public static IEnumerable<T> MyDistinct2<T, V>(this IEnumerable<T> query,
                                                Func<T, V> f,
                                                Func<T,bool> h=null
                                                )
{
    if (h == null) h = (y => true);
    return query.GroupBy(f).Select(x=>x.FirstOrDefault(h));
}

在这种情况下,查询将如下所示:

var myQuery2 = (from x in _myObject select x).MyDistinct2(d => d.Name,
                    y => y.Name.Contains("1") || y.Name.Contains("2")
                    );

N.B。此处,表达式更简单,但请注意.MyDistinct2隐式使用.FirstOrDefault(...)

注意:以上示例使用以下演示类

class MyObject
{
    public string Name;
    public string Code;
}

private MyObject[] _myObject = {
    new MyObject() { Name = "Test1", Code = "T"},
    new MyObject() { Name = "Test2", Code = "Q"},
    new MyObject() { Name = "Test2", Code = "T"},
    new MyObject() { Name = "Test5", Code = "Q"}
};

答案 18 :(得分:-1)

我发现这是最简单的解决方案。

type.convert(df, as.is = TRUE)