LINQ: combine two key/value lookups into a single one with the same keys

Time: 2017-11-01 04:19:31

Tags: c# linq

I am trying to merge two lookups into one: rows that share the same key should end up as a single row holding the value from each table.

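Currently I use LINQ to put the first table into a list, then loop over the second table and either add a new row (if the ID does not exist) or update Table2Value (if the ID exists). Because both tables hold a large amount of data, this is too slow to use.

Is there a way to achieve this using only LINQ?

Current code: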

        IQueryable<KeyValuePair<string, KeyValuePair<int, string>>> englishResources = _localizationService.GetAllResourceValues(1).AsQueryable();
        IQueryable<KeyValuePair<string, KeyValuePair<int, string>>> arabicResources = _localizationService.GetAllResourceValues(2).AsQueryable();

        List<LanguageResourceModel> languagesResources = englishResources.Select(c => new LanguageResourceModel()
        {
            Name = c.Key,
            EnglishValue = c.Value.Value,
        }).ToList();

        foreach (var item in arabicResources)
        {
            if (languagesResources.Any(c => c.Name.ToLower() == item.Key.ToLower()))
            {
                languagesResources.Where(c => c.Name.ToLower() == item.Key.ToLower()).FirstOrDefault().ArabicValue = item.Value.Value;
            }
            else
            {
                languagesResources.Add(new LanguageResourceModel
                {
                    Name = item.Key,
                    ArabicValue = item.Value.Value,
                });
            }
        }

2 Answers:

Answer 0 (score: 1)

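First, I assume you are using Nop.Services.Localization? If so, calling ToLower() when matching keys is overkill (because the keys are already in lowercase) and is likely to hurt performance, because it prevents us from taking advantage of the dictionary's efficient key lookup.

Second, what you are trying to do is a Full Outer Join. While you can do this in LINQ, it is far from efficient for large data sets.

Since you will be doing a lot of key matching, you need to look no further than a simple Dictionary. It is optimized for key lookups.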
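As a side note, if the keys were not guaranteed to be lowercase, a dictionary can still do the case-insensitive matching itself when it is given a comparer, so no ToLower() call is needed per lookup. The following is only a minimal sketch (assuming C# 9 or later for top-level statements); the key and value used here are made-up sample data, not taken from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample entry; the comparer makes key matching case-insensitive.
var byName = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["Common.Yes"] = "Yes",
};

// The lookup succeeds although the casing differs, and it is still a hash lookup.
bool found = byName.TryGetValue("common.yes", out string value);
Console.WriteLine(found ? value : "(not found)");
```

Enumerable.ToDictionary also has overloads that accept an IEqualityComparer<TKey>, so the same comparer could be passed when building the merged dictionary.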

You can play with the complete test code in Fiddle:

// No need for Queryables. The function already returns Dictionary which is already an IEnumerable.
var englishResources = GetAllResourceValues(1);
var arabicResources = GetAllResourceValues(2);

// Start with all the english words.
// Use dictionary because it has efficient key matching.
var merged = englishResources.ToDictionary( 
    i => i.Key, 
    i => new LanguageResourceModel { 
            Name = i.Key, 
            EnglishValue = i.Value.Value 
        });

// Now merge the arabic ones. 
// You could LINQ-ify the whole thing, but it's not gonna make it more efficient.
LanguageResourceModel found;
foreach (var item in arabicResources)
{
    // If value already exists, update it
    if (merged.TryGetValue(item.Key, out found))
        found.ArabicValue = item.Value.Value;
    else // Otherwise, add a new one
        merged[item.Key] = new LanguageResourceModel { 
            Name = item.Key, 
            ArabicValue = item.Value.Value 
        };
}
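For comparison, here is one way the merge could be "LINQ-ified", as the comment in the code above mentions. This is a sketch only, assuming C# 9 or later; the two dictionaries are made-up stand-ins for GetAllResourceValues(1) and GetAllResourceValues(2), and LanguageResourceModel is reduced to the three properties used in the question. It concatenates both sources and groups by key, which is also hash-based, but it is not expected to be faster than the dictionary loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for GetAllResourceValues(1) and GetAllResourceValues(2).
var english = new Dictionary<string, KeyValuePair<int, string>>
{
    ["common.yes"] = new KeyValuePair<int, string>(1, "Yes"),
    ["common.no"] = new KeyValuePair<int, string>(2, "No"),
};
var arabic = new Dictionary<string, KeyValuePair<int, string>>
{
    ["common.yes"] = new KeyValuePair<int, string>(3, "نعم"),
    ["common.save"] = new KeyValuePair<int, string>(4, "حفظ"),
};

// Project both sources to the same shape, concatenate, then group by key.
List<LanguageResourceModel> merged = english
    .Select(e => new { e.Key, English = e.Value.Value, Arabic = (string)null })
    .Concat(arabic.Select(a => new { a.Key, English = (string)null, Arabic = a.Value.Value }))
    .GroupBy(x => x.Key)
    .Select(g => new LanguageResourceModel
    {
        Name = g.Key,
        // Take the first non-null value of each language within the group.
        EnglishValue = g.Select(x => x.English).FirstOrDefault(v => v != null),
        ArabicValue = g.Select(x => x.Arabic).FirstOrDefault(v => v != null),
    })
    .ToList();

foreach (var m in merged)
    Console.WriteLine($"{m.Name}: {m.EnglishValue ?? "(none)"} / {m.ArabicValue ?? "(none)"}");

// Reduced version of the model from the question (assumption: only these members matter here).
public class LanguageResourceModel
{
    public string Name { get; set; }
    public string EnglishValue { get; set; }
    public string ArabicValue { get; set; }
}
```

Keys that exist in only one language end up with null for the other value, which matches what the loop above produces.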

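Answer 1 (score: 1)

What you want is a faster full outer join function, preferably one that can be used in a LINQ statement.

I will first write a solution specific to your problem. After that I will write a generic solution that can be used for any collections you want to full outer join.

If you are not familiar with (inner) joins, group joins, left outer joins, full outer joins, and so on, see: A visual explanation about joins

Your join is too slow because, for every element of collection A, you check every element of collection B to find a matching Id. That is a waste of time.

You need a fast lookup: given an Id, which element has this Id? This is a typical use case for a dictionary.

Below is a function that speeds up the full outer join. I assume your table1 is a sequence of T1 objects, table2 is a sequence of T2 objects, and your result table is a sequence of TResult objects.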

IEnumerable<TResult> FullOuterJoin(IEnumerable<T1> table1, IEnumerable<T2> table2)
{
    // put table1 elements in a dictionary with Id as key
    // do the same for table2 elements
    Dictionary<int, T1> lookup1 = table1.ToDictionary(t1 => t1.Id);
    Dictionary<int, T2> lookup2 = table2.ToDictionary(t2 => t2.Id);

    // create a sequence of all Ids used in table1 and/or table2
    // remove duplicates using Distinct
    IEnumerable<int> allIdsInT1 = table1.Select(t1 => t1.Id);
    IEnumerable<int> allIdsInT2 = table2.Select(t2 => t2.Id);
    IEnumerable<int> allUsedIds = allIdsInT1
        .Concat(allIdsInT2)
        .Distinct();

    // now enumerate over all elements in allUsedIds.
    // find the matching element in lookup1
    // find the matching element in lookup2
    // if no match found: use null
    foreach (int id in allUsedIds)
    {
        // find the element with Id in lookup1; use null if there is no such element
        T1 found1;
        bool t1Found = lookup1.TryGetValue(id, out found1);
        if (!t1Found) found1 = null;

        // find the element with Id in lookup2; use null if there is no such element
        T2 found2;
        bool t2Found = lookup2.TryGetValue(id, out found2);
        if (!t2Found) found2 = null;

        TResult result = new TResult()
        {
            Id = id,
            Table1Value = found1,
            Table2Value = found2,
        };
        yield return result;
    }
}

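This will solve your efficiency problem.

Note: I was able to use a Dictionary because I assume your IDs are unique. If they are not, use a Lookup table. I will do this in the LINQ example below.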
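To illustrate the difference mentioned in the note, here is a small sketch (assuming C# 9 or later; the rows are made-up sample data). Enumerable.ToDictionary throws an ArgumentException when two elements produce the same key, while Enumerable.ToLookup groups them, and indexing a lookup with a missing key returns an empty sequence instead of throwing.

```csharp
using System;
using System.Linq;

// Hypothetical rows where Id 1 occurs twice.
var rows = new[] { (Id: 1, Value: "a"), (Id: 1, Value: "b"), (Id: 2, Value: "c") };

// ToDictionary cannot hold duplicate keys.
bool dictionaryFailed = false;
try
{
    var dictionary = rows.ToDictionary(r => r.Id);
}
catch (ArgumentException)
{
    dictionaryFailed = true;
}

// ToLookup keeps every element per key.
var lookup = rows.ToLookup(r => r.Id);
int countForId1 = lookup[1].Count();   // both rows with Id 1
int countForId3 = lookup[3].Count();   // missing key: empty sequence, no exception

Console.WriteLine($"{dictionaryFailed} {countForId1} {countForId3}");
```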

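If you need this kind of function often, consider creating an extension function for Enumerable that does the same for any two collections, any kind of key, and any kind of equality.

For extension methods, see: Extension Methods Demystified

Create a function that takes two sequences. You specify which property of sequence A and which property of sequence B to use to find the common value we will join on. You can also specify an equality comparer for the keys; if you don't, the default comparer is used.

After the full outer join you have a key, a sequence of elements from A that match this key, and a sequence of elements from B that match this key. You specify what to do with these three to create the result.

The extension function: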

public static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
   this IEnumerable<TA> sourceA,  // The first collection
   IEnumerable<TB> sourceB,       // The second collection
   Func<TA, TKey> keySelectorA,   // which property from A is the key to join on?
   Func<TB, TKey> keySelectorB,   // which property from B is the key to join on?
   Func<TA, TB, TKey, TResult> resultSelector,
   // says what to return with the matching elements and the key

   TA defaultA = default(TA),     // use this value if no matching A is found
   TB defaultB = default(TB),     // use this value if no matching B is found
   IEqualityComparer<TKey> cmp = null)
   // the equality comparer used to check if key A equals key B
  {
      // TODO implement
  }

The last three parameters have default values. If you don't specify them, the usual defaults are used.

In your case you would use it as follows:

IEnumerable<T1> table1 = ...
IEnumerable<T2> table2 = ...

// Full Outer Join table1 and table2
IEnumerable<MyResult> ResultTable = table1.FullOuterJoin(table2,
   t1 => t1.Id,            // from every element of table1 take the Id
   t2 => t2.Id,            // from every element of table2 also take the Id

   // if you have a match t1, t2, (or default) with key, create a MyResult:
   (t1, t2, key) => new MyResult()
   {
       Id = key,
       Table1Value = t1,
       Table2Value = t2
   });

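Because I use the defaults for the last three parameters, elements that are not found will be null, and the default integer comparer is used.

The implementation is similar to the one above. Because several elements may have the same key, I use lookup tables.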

// If no EqualityComparer is provided, use the default one:
cmp = cmp ?? EqualityComparer<TKey>.Default;

// create the two lookup tables:
ILookup<TKey, TA> alookup = sourceA.ToLookup(keySelectorA, cmp);
ILookup<TKey, TB> blookup = sourceB.ToLookup(keySelectorB, cmp);

// get a collection of all keys used in sourceA and/or sourceB.
// Remove duplicates using the equalityComparer
IEnumerable<TKey> allKeysUsedInA = sourceA.Select(a => keySelectorA(a));
IEnumerable<TKey> allKeysUsedInB = sourceB.Select(b => keySelectorB(b));
IEnumerable<TKey> allUsedKeys = allKeysUsedInA
    .Concat(allKeysUsedInB)
    .Distinct(cmp);

// now enumerate over all keys, get the matching elements from sourceA / sourceB 
// use defaults if not available
foreach (TKey key in allUsedKeys)
{
    // get all A elements with TKey, use the default value if the key is not found
    IEnumerable<TA> foundAs = alookup[key].DefaultIfEmpty(defaultA);
    foreach (TA foundA in foundAs)
    {
        // get all B elements with TKey, use the default value if the key is not found
        IEnumerable<TB> foundBs = blookup[key].DefaultIfEmpty(defaultB);
        foreach (TB foundB in foundBs)
        {
            TResult result = resultSelector(foundA, foundB, key);
            yield return result;
        }
    }
}

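Usage: suppose you have a sequence of cities, each with an Id and a Name, and a sequence of streets, each with an Id and a Name. Every street belongs to a city through the foreign key CityId.

For every city I want the name of the city and the names of its streets.

The full outer join would be: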

var result = Cities.FullOuterJoin(Streets,
    city => city.Id,                    // from every city take the Id
    street => street.CityId,            // from every street take the CityId
    (city, street, matchingId) => new   // when they match create a new object
    {                                   // using the matching city, street and matching Id
        CityName = city?.Name,          // the name of the city, or null if no city matches
        StreetName = street?.Name,      // the name of the street, or null if no street matches
    });                                 // we don't use the matching id
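Putting the pieces together, the following is a self-contained sketch of the extension function with the city/street usage. It assumes C# 9 or later (top-level statements and records); the City and Street records, the class name FullOuterJoinExtensions, and the sample data are invented for illustration. Unlike the implementation above, it takes the keys from the two lookups rather than from the source sequences, which should give the same set of keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var cities = new[] { new City(1, "Paris"), new City(2, "Oslo") };
var streets = new[]
{
    new Street(10, "Rue A", 1),
    new Street(11, "Rue B", 1),
    new Street(12, "Main St", 3),   // refers to a city that is not in the list
};

var rows = cities.FullOuterJoin(streets,
        city => city.Id,
        street => street.CityId,
        // city or street is null when there is no match, hence the ?. operator
        (city, street, id) => (City: city?.Name, Street: street?.Name))
    .ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.City ?? "(none)"} / {row.Street ?? "(none)"}");

public record City(int Id, string Name);
public record Street(int Id, string Name, int CityId);

public static class FullOuterJoinExtensions
{
    public static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> sourceA,
        IEnumerable<TB> sourceB,
        Func<TA, TKey> keySelectorA,
        Func<TB, TKey> keySelectorB,
        Func<TA, TB, TKey, TResult> resultSelector,
        TA defaultA = default(TA),
        TB defaultB = default(TB),
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp ?? EqualityComparer<TKey>.Default;

        // Lookups allow several elements per key.
        ILookup<TKey, TA> aLookup = sourceA.ToLookup(keySelectorA, cmp);
        ILookup<TKey, TB> bLookup = sourceB.ToLookup(keySelectorB, cmp);

        // Every key that occurs in A and/or B, without duplicates.
        IEnumerable<TKey> allKeys = aLookup.Select(g => g.Key)
            .Concat(bLookup.Select(g => g.Key))
            .Distinct(cmp);

        // A missing key yields an empty sequence, which DefaultIfEmpty replaces by the default.
        foreach (TKey key in allKeys)
            foreach (TA a in aLookup[key].DefaultIfEmpty(defaultA))
                foreach (TB b in bLookup[key].DefaultIfEmpty(defaultB))
                    yield return resultSelector(a, b, key);
    }
}
```

With this sample data the join yields four rows: Paris with each of its two streets, Oslo without a street, and "Main St" without a city.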