Question

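I have a LINQ statement that pulls the first N record IDs from a collection, and then a second query that pulls every record with one of those IDs. It feels very clunky and inefficient, and I was wondering whether there is a more concise, LINQy way to get the same result.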

var records = cache.Select(rec => rec.Id).Distinct().Take(n);

var results = cache.Where(rec => records.Contains(rec.Id));

FYI - there will be multiple records with the same ID, which is why the Distinct() is there and why I can't just use a simple Take() in the first place.

Thanks!

Answer 1

How about something like this?

var results = cache.GroupBy(rec => rec.Id, rec => rec)
                   .Take(n)
                   .SelectMany(rec => rec);
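
To make the behavior concrete, here is a minimal sketch of the same GroupBy / Take / SelectMany chain run against sample data. The `Record` type, its `Name` property, and the `FirstNIds` helper are assumptions made up for illustration; the question only says that records have an `Id`. In LINQ to Objects, `GroupBy` yields groups in the order in which each key first appears, so `Take(n)` keeps the first n distinct IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the question only tells us records have an Id.
public class Record
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class GroupByDemo
{
    // Returns every record whose Id is among the first n distinct Ids seen.
    public static List<Record> FirstNIds(IEnumerable<Record> cache, int n)
    {
        return cache.GroupBy(rec => rec.Id)      // one group per distinct Id, in first-seen order
                    .Take(n)                     // keep the first n groups
                    .SelectMany(group => group)  // flatten the groups back into records
                    .ToList();
    }

    public static void Main()
    {
        var cache = new List<Record>
        {
            new Record { Id = 3, Name = "a" },
            new Record { Id = 1, Name = "b" },
            new Record { Id = 3, Name = "c" },
            new Record { Id = 2, Name = "d" },
            new Record { Id = 1, Name = "e" },
        };

        foreach (var rec in FirstNIds(cache, 2))
            Console.WriteLine(rec.Id + " " + rec.Name);
    }
}
```

With n = 2 this keeps the records for IDs 3 and 1 and returns them grouped by ID (a, c, b, e) rather than in their original order, which may or may not matter for your use.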

Answer 2

You can do the same thing in one statement, using Join() instead of Contains():

var results = cache
    .Select(rec => rec.Id)
    .Distinct()
    .Take(n)
    .ToList()
    .Join(cache, rec => rec, record => record.Id, (rec, record) => record);

Answer 3

Yes, unfortunately LINQ doesn't natively let you pick a member to determine distinctness by. So I'd suggest creating your own extension method for it:

/// <summary>
/// Returns the distinct elements of a sequence, with the ability to specify key(s) to compare uniqueness on
/// </summary>
/// <typeparam name="T">Source type</typeparam>
/// <param name="source">Source</param>
/// <param name="keyPredicate">Predicate with key(s) to perform comparison on</param>
/// <returns>The elements of the source that are distinct by the selected key(s)</returns>
public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source,
                                         Func<T, object> keyPredicate)
{
    return source.Distinct(new GenericComparer<T>(keyPredicate));
}

Then create a generic comparer, which you'll notice is quite generic.

public class GenericComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, object> _uniqueCheckerMethod;

    public GenericComparer(Func<T, object> keyPredicate)
    {
        _uniqueCheckerMethod = keyPredicate;
    }

    #region IEqualityComparer<T> Members

    bool IEqualityComparer<T>.Equals(T x, T y)
    {
        // The static object.Equals copes with null keys.
        return object.Equals(_uniqueCheckerMethod(x), _uniqueCheckerMethod(y));
    }

    int IEqualityComparer<T>.GetHashCode(T obj)
    {
        object key = _uniqueCheckerMethod(obj);
        return key == null ? 0 : key.GetHashCode();
    }

    #endregion
}

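Now just chain your LINQ statement. Instead of var records = cache.Select(rec => rec.Id).Distinct().Take(n); you can write:

var results = cache.Distinct(rec => rec.Id).Take(n);

Note that this returns one record per distinct Id, not every record that shares those Ids.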

HTH

Answer 4

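The only way I can think of to do this in SQL is with a subquery, so you are probably going to end up with two LINQ queries as well. It "feels" inefficient... but is it? Maybe you're worrying about something that isn't worth worrying about. You can turn it into one line with a join, but whether that is clearer, better, or more efficient is a different question.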
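
One cost in the two-query version from the question is concrete, though: `records` is a deferred query, so every `records.Contains(...)` call inside `Where` re-runs it against the cache. A minimal sketch of one way around that, assuming an in-memory cache and an `int` Id, is to materialize the IDs once into a `HashSet<int>`. The `Record` type and the `FirstNIds` helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the question only tells us records have an Id.
public class Record
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class TwoStepDemo
{
    public static List<Record> FirstNIds(IEnumerable<Record> cache, int n)
    {
        // Run the "first n distinct Ids" query exactly once and keep the result.
        var ids = new HashSet<int>(cache.Select(rec => rec.Id).Distinct().Take(n));

        // Single pass over the cache with O(1) lookups; original record order is kept.
        return cache.Where(rec => ids.Contains(rec.Id)).ToList();
    }

    public static void Main()
    {
        var cache = new List<Record>
        {
            new Record { Id = 3, Name = "a" },
            new Record { Id = 1, Name = "b" },
            new Record { Id = 3, Name = "c" },
            new Record { Id = 2, Name = "d" },
            new Record { Id = 1, Name = "e" },
        };

        foreach (var rec in FirstNIds(cache, 2))
            Console.WriteLine(rec.Id + " " + rec.Name);
    }
}
```

This is still two queries, but each one walks the cache only once, and the results come back in their original order.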

Edit: Aaronaught's extension method answer could be made to work like this:

public static IEnumerable<T> TakeByDistinctKey<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keyFunc, int numKeys) {
    if(keyFunc == null) {
        throw new ArgumentNullException("keyFunc");
    }

    // a HashSet keeps the "seen this key?" check O(1) instead of scanning a list
    HashSet<TKey> keys = new HashSet<TKey>();
    foreach(T item in source) {
        TKey key = keyFunc(item);
        if(keys.Contains(key)) {
            // one of the first n keys, yield
            yield return item;
        } else if(keys.Count < numKeys) {
            // new key, but still one of the first n seen, yield
            keys.Add(key);
            yield return item;
        }
        // have enough distinct keys, just keep going to return all of the items with those keys
    }
}

However, the GroupBy / SelectMany approach looks like the best one. I'd go with that.

Answer 5

There's no built-in "Linqy" way (you could group, but that's pretty inefficient), but that doesn't mean you can't roll your own:

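public static IEnumerable<T> TakeDistinctByKey<T, TKey>(
    this IEnumerable<T> source,
    Func<T, TKey> keyFunc,
    int count)
{
    if (keyFunc == null)
        throw new ArgumentNullException("keyFunc");
    if (count <= 0)
        yield break;

    // TKey is unconstrained, so == and != are not available; use the default comparer.
    var comparer = EqualityComparer<TKey>.Default;
    int keyCount = 0;
    TKey lastKey = default(TKey);
    bool isFirst = true;
    foreach (T item in source)
    {
        TKey key = keyFunc(item);
        // A new key starts on the first item and whenever the key changes.
        if (isFirst || !comparer.Equals(key, lastKey))
        {
            keyCount++;
            // Stop before yielding anything that belongs to key number count + 1.
            if (keyCount > count)
                yield break;
            isFirst = false;
            lastKey = key;
        }
        yield return item;
    }
}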

Then you can call it like this:

var items = cache.TakeDistinctByKey(rec => rec.Id, 20);

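If you have a composite key or something similar, you can easily extend the method above to take an IEqualityComparer<TKey> as a parameter.

Also note that this depends on the elements being sorted by key. If they aren't, you can either change the algorithm above to use a HashSet<TKey> instead of the simple count and last-key comparison, or call it like this: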

var items = cache.OrderBy(rec => rec.Id).TakeDistinctByKey(rec => rec.Id, 20);
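
The IEqualityComparer<TKey> extension mentioned above might look roughly like the sketch below. This is an illustrative overload rather than a definitive implementation: the `TakeDistinctExtensions` class name, the private `Iterate` helper, and the demo data are all made up, and like the original method it assumes that items with equal keys are adjacent in the input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeDistinctExtensions
{
    // Hypothetical overload: same idea as TakeDistinctByKey, but with a caller-supplied key comparer.
    public static IEnumerable<T> TakeDistinctByKey<T, TKey>(
        this IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count,
        IEqualityComparer<TKey> comparer)
    {
        // Validate eagerly; the iterator body below only runs once enumeration starts.
        if (source == null)
            throw new ArgumentNullException("source");
        if (keyFunc == null)
            throw new ArgumentNullException("keyFunc");
        return Iterate(source, keyFunc, count, comparer ?? EqualityComparer<TKey>.Default);
    }

    private static IEnumerable<T> Iterate<T, TKey>(
        IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count,
        IEqualityComparer<TKey> comparer)
    {
        if (count <= 0)
            yield break;

        int keyCount = 0;
        TKey lastKey = default(TKey);
        bool isFirst = true;
        foreach (T item in source)
        {
            TKey key = keyFunc(item);
            // Count a new key on the first item and whenever the key changes.
            if (isFirst || !comparer.Equals(key, lastKey))
            {
                keyCount++;
                if (keyCount > count)
                    yield break;
                isFirst = false;
                lastKey = key;
            }
            yield return item;
        }
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        // Keys that differ only by case are treated as the same key.
        var codes = new[] { "a", "A", "b", "B", "c" };
        foreach (var code in codes.TakeDistinctByKey(s => s, 2, StringComparer.OrdinalIgnoreCase))
            Console.WriteLine(code);
    }
}
```

For a composite key you would pass a comparer for whatever key type `keyFunc` returns; the case-insensitive string comparer here is just the simplest stand-in.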

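Edit - I'd also like to point out that in SQL I would use either a ROW_NUMBER query or a recursive CTE, depending on the performance requirements; a distinct + join is not the most efficient approach. If your cache is in sorted order (or if you can change it to be), then the method above will be by far the cheapest in terms of both memory and execution time.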

Using LINQ to get results from another LINQ collection
