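I have a list of people's IDs and first names, and a list of people's IDs and surnames. Some people don't have a first name and some don't have a surname; I'd like to do a full outer join on the two lists.
So the following lists: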
ID FirstName
-- ---------
1 John
2 Sue
ID LastName
-- --------
1 Doe
3 Smith
should produce:
ID FirstName LastName
-- --------- --------
1 John Doe
2 Sue
3 Smith
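I'm new to LINQ (so forgive me if I'm being lame) and have found quite a few solutions for "LINQ Outer Joins" which all look quite similar, but really seem to be left outer joins.
My attempts so far go something like this: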
private void OuterJoinTest()
{
List<FirstName> firstNames = new List<FirstName>();
firstNames.Add(new FirstName { ID = 1, Name = "John" });
firstNames.Add(new FirstName { ID = 2, Name = "Sue" });
List<LastName> lastNames = new List<LastName>();
lastNames.Add(new LastName { ID = 1, Name = "Doe" });
lastNames.Add(new LastName { ID = 3, Name = "Smith" });
var outerJoin = from first in firstNames
join last in lastNames
on first.ID equals last.ID
into temp
from last in temp.DefaultIfEmpty()
select new
{
id = first != null ? first.ID : last.ID,
firstname = first != null ? first.Name : string.Empty,
surname = last != null ? last.Name : string.Empty
};
}
}
public class FirstName
{
public int ID;
public string Name;
}
public class LastName
{
public int ID;
public string Name;
}
But this returns:
ID FirstName LastName
-- --------- --------
1 John Doe
2 Sue
What am I doing wrong?
Answer 0 (score: 174)
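Update 1: providing a truly generalized extension method FullOuterJoin
Update 2: optionally accepting a custom IEqualityComparer for the key type
Update 3: this implementation has recently become part of MoreLinq - Thanks guys!
Edit: Added FullOuterGroupJoin (ideone). I reused the GetOuter<> implementation, which makes this a fraction less performant than it could be, but I'm aiming for "high-level" code, not bleeding-edge optimized, right now.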
static void Main(string[] args)
{
var ax = new[] {
new { id = 1, name = "John" },
new { id = 2, name = "Sue" } };
var bx = new[] {
new { id = 1, surname = "Doe" },
new { id = 3, surname = "Smith" } };
ax.FullOuterJoin(bx, a => a.id, b => b.id, (a, b, id) => new {a, b})
.ToList().ForEach(Console.WriteLine);
}
Prints the output:
{ a = { id = 1, name = John }, b = { id = 1, surname = Doe } }
{ a = { id = 2, name = Sue }, b = }
{ a = , b = { id = 3, surname = Smith } }
You could also supply defaults: http://ideone.com/kG4kqO
ax.FullOuterJoin(
bx, a => a.id, b => b.id,
(a, b, id) => new { a.name, b.surname },
new { id = -1, name = "(no firstname)" },
new { id = -2, surname = "(no surname)" }
)
Printing:
{ name = John, surname = Doe }
{ name = Sue, surname = (no surname) }
{ name = (no firstname), surname = Smith }
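Joining is a term borrowed from relational database design:
A join repeats elements from a as many times as there are elements in b with a corresponding key (i.e. nothing if b were empty). Database lingo calls this an inner (equi)join.
An outer join also includes elements from a for which no corresponding element exists in b (i.e. there are results even if b were empty). This is usually referred to as a left join.
A full outer join includes records from a as well as b if no corresponding element exists in the other (i.e. there are results even if a were empty).
Something not usually seen in an RDBMS is a group join [1]:
A group join does the same as described above, but instead of repeating elements from a for multiple corresponding elements of b, it groups the records with corresponding keys. This is often more convenient when you wish to enumerate through "joined" records based on a common key. See also GroupJoin, which contains some general background explanations as well.
[1] (I believe Oracle and MSSQL have proprietary extensions for this)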
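The terms above can be illustrated with the built-in LINQ operators. The following is a minimal sketch on the question's data; the class and method names (JoinKindsDemo, Inner, Left, Grouped) and the "(none)" placeholder are made up for illustration. Note that none of the three produces the row for Smith (ID 3), which is exactly what a full outer join has to add.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinKindsDemo
{
    // Sample data from the question, held in value tuples for brevity.
    static readonly (int Id, string Name)[] A = { (1, "John"), (2, "Sue") };
    static readonly (int Id, string Surname)[] B = { (1, "Doe"), (3, "Smith") };

    // Inner (equi)join: only keys present on both sides survive.
    public static List<string> Inner() =>
        A.Join(B, a => a.Id, b => b.Id, (a, b) => $"{a.Name} {b.Surname}").ToList();

    // Left join: every element of A; a placeholder stands in when B has no match.
    public static List<string> Left() =>
        A.GroupJoin(B, a => a.Id, b => b.Id, (a, bs) => new { a, bs })
         .SelectMany(x => x.bs.Select(b => b.Surname).DefaultIfEmpty("(none)"),
                     (x, surname) => $"{x.a.Name} {surname}")
         .ToList();

    // Group join: one result per element of A, carrying all of its matches from B.
    public static List<string> Grouped() =>
        A.GroupJoin(B, a => a.Id, b => b.Id,
                    (a, bs) => $"{a.Name}: {bs.Count()} match(es)")
         .ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Inner()));   // John Doe
        Console.WriteLine(string.Join(", ", Left()));    // John Doe, Sue (none)
        Console.WriteLine(string.Join(", ", Grouped())); // John: 1 match(es), Sue: 0 match(es)
    }
}
```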
A generalized "drop-in" extension class provides the FullOuterJoin method used above.
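The extension class itself is not reproduced here. As a rough sketch only, and not the answer's actual code, a lookup-based extension method with a signature compatible with the calls shown earlier (key selectors, a three-argument projection, optional defaults, optional comparer) might look like the following. The class name FullOuterJoinSketch and all parameter names are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: the names and parameter order are assumptions,
// chosen only so that calls like the ones shown earlier compile.
public static class FullOuterJoinSketch
{
    public static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA,
        Func<TB, TKey> selectKeyB,
        Func<TA, TB, TKey, TResult> projection,
        TA defaultA = default(TA),
        TB defaultB = default(TB),
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp ?? EqualityComparer<TKey>.Default;

        // Both inputs are enumerated eagerly here, once each.
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        // Every key that occurs on either side.
        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        // A missing side contributes exactly one default element, so no key is dropped.
        return from key in keys
               from xa in alookup[key].DefaultIfEmpty(defaultA)
               from xb in blookup[key].DefaultIfEmpty(defaultB)
               select projection(xa, xb, key);
    }
}

public static class Program
{
    public static void Main()
    {
        var ax = new[] { new { id = 1, name = "John" }, new { id = 2, name = "Sue" } };
        var bx = new[] { new { id = 1, surname = "Doe" }, new { id = 3, surname = "Smith" } };

        foreach (var row in ax.FullOuterJoin(bx, a => a.id, b => b.id,
                     (a, b, id) => $"{id}: {a?.name} {b?.surname}"))
            Console.WriteLine(row);
    }
}
```

Because the lookups are built as soon as the method is called, this sketch does not defer execution, and the order of the result rows follows the enumeration order of the key set, which HashSet does not guarantee.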
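Answer 1 (score: 107)
I don't know if this covers all cases, but logically it seems correct. The idea is to take a left outer join and a right outer join, then take the union of the results.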
var firstNames = new[]
{
new { ID = 1, Name = "John" },
new { ID = 2, Name = "Sue" },
};
var lastNames = new[]
{
new { ID = 1, Name = "Doe" },
new { ID = 3, Name = "Smith" },
};
var leftOuterJoin =
from first in firstNames
join last in lastNames on first.ID equals last.ID into temp
from last in temp.DefaultIfEmpty()
select new
{
first.ID,
FirstName = first.Name,
LastName = last?.Name,
};
var rightOuterJoin =
from last in lastNames
join first in firstNames on last.ID equals first.ID into temp
from first in temp.DefaultIfEmpty()
select new
{
last.ID,
FirstName = first?.Name,
LastName = last.Name,
};
var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);
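This works as written since it is in LINQ to Objects. With LINQ to SQL or another provider, the query processor might not support safe navigation (?.) or other operations. You would have to use the conditional operator to conditionally get the values.
i.e.,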
var leftOuterJoin =
from first in firstNames
join last in lastNames on first.ID equals last.ID into temp
from last in temp.DefaultIfEmpty()
select new
{
first.ID,
FirstName = first.Name,
LastName = last != null ? last.Name : default,
};
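Answer 2 (score: 15)
I think there are problems with most of these, including the accepted answer, because they don't work well with Linq over IQueryable: they do too many server round trips, return too much data, or do too much client-side execution.
For IEnumerable I don't like Sehe's answer or similar ones because they use excessive memory (a simple test with two lists of 10000000 items ran LINQPad out of memory on my 32GB machine).
Also, most of the others don't actually implement a proper Full Outer Join, because they use a Union with a Right Join instead of a Concat with a Right Anti Semi Join. The Union not only eliminates the duplicate inner join rows from the result, but also any proper duplicates that existed originally in the left or right data.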
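The difference between the two combinations can be seen on a small in-memory example. This is a minimal sketch with made-up names (UnionVersusConcatDemo, ViaUnion, ViaConcat) in which the left input contains a genuine duplicate row; it is an illustration of the point above, not part of the extensions in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionVersusConcatDemo
{
    // The left side contains a genuine duplicate row: (1, "John") appears twice.
    static readonly (int Id, string Name)[] Left = { (1, "John"), (1, "John"), (2, "Sue") };
    static readonly (int Id, string Name)[] Right = { (1, "Doe"), (3, "Smith") };

    static IEnumerable<(int Id, string First, string Last)> LeftOuter() =>
        from l in Left
        join r in Right on l.Id equals r.Id into g
        from last in g.Select(x => x.Name).DefaultIfEmpty()
        select (l.Id, l.Name, last);

    static IEnumerable<(int Id, string First, string Last)> RightOuter() =>
        from r in Right
        join l in Left on r.Id equals l.Id into g
        from first in g.Select(x => x.Name).DefaultIfEmpty()
        select (r.Id, first, r.Name);

    // Right anti semi join: rows of Right whose key never occurs in Left.
    static IEnumerable<(int Id, string First, string Last)> RightAntiSemi()
    {
        var leftKeys = new HashSet<int>(Left.Select(l => l.Id));
        return Right.Where(r => !leftKeys.Contains(r.Id))
                    .Select(r => (r.Id, (string)null, r.Name));
    }

    // Union applies Distinct semantics, so the second (1, John, Doe) row is lost.
    public static List<(int Id, string First, string Last)> ViaUnion() =>
        LeftOuter().Union(RightOuter()).ToList();

    // Concat keeps every row of the left outer join and only adds unmatched right rows.
    public static List<(int Id, string First, string Last)> ViaConcat() =>
        LeftOuter().Concat(RightAntiSemi()).ToList();

    public static void Main()
    {
        Console.WriteLine(ViaUnion().Count);  // 3
        Console.WriteLine(ViaConcat().Count); // 4
    }
}
```

A SQL FULL OUTER JOIN over the same data would return four rows, which matches the Concat variant.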
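So here are my extensions that handle all of these issues: they generate SQL as well as implementing the join in LINQ to SQL directly, executing on the server, and they are faster and use less memory than the others on Enumerables: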
public static class Ext {
public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector) {
return from left in leftItems
join right in rightItems on leftKeySelector(left) equals rightKeySelector(right) into temp
from right in temp.DefaultIfEmpty()
select resultSelector(left, right);
}
public static IEnumerable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector) {
return from right in rightItems
join left in leftItems on rightKeySelector(right) equals leftKeySelector(left) into temp
from left in temp.DefaultIfEmpty()
select resultSelector(left, right);
}
public static IEnumerable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector) {
return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
}
public static IEnumerable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector) where TLeft : class {
var hashLK = new HashSet<TKey>(from l in leftItems select leftKeySelector(l));
return rightItems.Where(r => !hashLK.Contains(rightKeySelector(r))).Select(r => resultSelector((TLeft)null,r));
}
public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector) where TLeft : class {
return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
}
private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;
public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
this IQueryable<TLeft> leftItems,
IQueryable<TRight> rightItems,
Expression<Func<TLeft, TKey>> leftKeySelector,
Expression<Func<TRight, TKey>> rightKeySelector,
Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {
var sampleAnonLR = new { left = (TLeft)null, rightg = (IEnumerable<TRight>)null };
var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
var parmC = Expression.Parameter(typeof(TRight), "c");
var argLeft = Expression.PropertyOrField(parmP, "left");
var newleftrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, parmC), parmP, parmC), sampleAnonLR, (TRight)null, (TResult)null);
return leftItems.AsQueryable().GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
}
public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
this IQueryable<TLeft> leftItems,
IQueryable<TRight> rightItems,
Expression<Func<TLeft, TKey>> leftKeySelector,
Expression<Func<TRight, TKey>> rightKeySelector,
Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {
var sampleAnonLR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
var parmC = Expression.Parameter(typeof(TLeft), "c");
var argRight = Expression.PropertyOrField(parmP, "right");
var newrightrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, parmC, argRight), parmP, parmC), sampleAnonLR, (TLeft)null, (TResult)null);
return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
}
public static IQueryable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
this IQueryable<TLeft> leftItems,
IQueryable<TRight> rightItems,
Expression<Func<TLeft, TKey>> leftKeySelector,
Expression<Func<TRight, TKey>> rightKeySelector,
Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {
return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
}
private static Expression<Func<TP, TResult>> CastSBody<TP, TResult>(LambdaExpression ex, TP unusedP, TResult unusedRes) => (Expression<Func<TP, TResult>>)ex;
public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
this IQueryable<TLeft> leftItems,
IQueryable<TRight> rightItems,
Expression<Func<TLeft, TKey>> leftKeySelector,
Expression<Func<TRight, TKey>> rightKeySelector,
Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {
var sampleAnonLgR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
var argLeft = Expression.Constant(null, typeof(TLeft));
var argRight = Expression.PropertyOrField(parmLgR, "right");
var newrightrs = CastSBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, argRight), parmLgR), sampleAnonLgR, (TResult)null);
return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
}
public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
this IQueryable<TLeft> leftItems,
IQueryable<TRight> rightItems,
Expression<Func<TLeft, TKey>> leftKeySelector,
Expression<Func<TRight, TKey>> rightKeySelector,
Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {
return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
}
}
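The difference a Right Anti-Semi-Join makes is mostly moot with Linq to Objects or in the source, but it makes a difference on the server (SQL) side in the final answer, removing an unnecessary JOIN.
The hand coding of Expression to handle merging an Expression<Func<>> into a lambda could be improved with LinqKit, but it would be nice if the language/compiler had added some help for that. The FullOuterJoinDistinct and RightOuterJoin functions are included for completeness, but I have not re-implemented FullOuterGroupJoin yet.
I wrote another version of a full outer join for IEnumerable for cases where the key is orderable, which is about 50% faster than combining the left outer join with the right anti semi join, at least on small collections. It goes through each collection just once after sorting.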
Answer 3 (score: 7)
Here is an extension method that does this:
public static IEnumerable<KeyValuePair<TLeft, TRight>> FullOuterJoin<TLeft, TRight>(this IEnumerable<TLeft> leftItems, Func<TLeft, object> leftIdSelector, IEnumerable<TRight> rightItems, Func<TRight, object> rightIdSelector)
{
var leftOuterJoin = from left in leftItems
join right in rightItems on leftIdSelector(left) equals rightIdSelector(right) into temp
from right in temp.DefaultIfEmpty()
select new { left, right };
var rightOuterJoin = from right in rightItems
join left in leftItems on rightIdSelector(right) equals leftIdSelector(left) into temp
from left in temp.DefaultIfEmpty()
select new { left, right };
var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);
return fullOuterJoin.Select(x => new KeyValuePair<TLeft, TRight>(x.left, x.right));
}
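Answer 4 (score: 6)
As you've found, Linq doesn't have an "outer join" construct. The closest you can get is a left outer join using the query you stated. To this, you can add any elements of the last-name list that aren't represented in the join: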
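// Materialize the left outer join first: the lambda below would otherwise capture the
// reassigned outerJoin variable and recurse into the concatenated query while it is enumerated.
var leftOuterJoin = outerJoin.ToList();
outerJoin = leftOuterJoin.Concat(lastNames.Select(l=>new
{
id = l.ID,
firstname = String.Empty,
surname = l.Name
}).Where(l=>!leftOuterJoin.Any(o=>o.id == l.id)));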
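Answer 5 (score: 4)
I'm guessing @sehe's approach is stronger, but until I understand it better, I find myself leap-frogging off of @MichaelSander's extension. I modified it to match the syntax and return type of the built-in Enumerable.Join() method described here. I appended the "distinct" suffix in respect to @cadrell0's comment under @JeffMercado's solution.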
public static class MyExtensions {
public static IEnumerable<TResult> FullJoinDistinct<TLeft, TRight, TKey, TResult> (
this IEnumerable<TLeft> leftItems,
IEnumerable<TRight> rightItems,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TResult> resultSelector
) {
var leftJoin =
from left in leftItems
join right in rightItems
on leftKeySelector(left) equals rightKeySelector(right) into temp
from right in temp.DefaultIfEmpty()
select resultSelector(left, right);
var rightJoin =
from right in rightItems
join left in leftItems
on rightKeySelector(right) equals leftKeySelector(left) into temp
from left in temp.DefaultIfEmpty()
select resultSelector(left, right);
return leftJoin.Union(rightJoin);
}
}
In the example, you would use it like this:
var test =
firstNames
.FullJoinDistinct(
lastNames,
f=> f.ID,
j=> j.ID,
(f,j)=> new {
ID = f == null ? j.ID : f.ID,
leftName = f == null ? null : f.Name,
rightName = j == null ? null : j.Name
}
);
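In the future, as I learn more, I have a feeling I'll be migrating to @sehe's logic given its popularity. But even then I'll have to be careful, because I feel it is important to have at least one overload that matches the syntax of the existing ".Join()" method if feasible, for two reasons:
I'm still new to generics, extensions, Func statements, and other features, so feedback is certainly welcome.
EDIT: It didn't take me long to realize there was a problem with my code. I was doing a .Dump() in LINQPad and looking at the return type. It was just IEnumerable, so I tried to match it. But when I actually did a .Where() or .Select() on my extension I got an error: "'System Collections.IEnumerable' does not contain a definition for 'Select' and ...". So in the end I was able to match the input syntax of .Join(), but not the return behavior.
EDIT: Added "TResult" to the return type of the function. I missed that when reading the Microsoft article, and of course it makes sense. With this fix, the return behavior now seems to be in line with my goals after all.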
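Answer 6 (score: 1)
I like these answers, but they do not use deferred execution (the input sequences are eagerly enumerated by the calls to ToLookup). So after looking at the .NET sources for LINQ-to-objects, I came up with this: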
public static class LinqExtensions
{
public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> left,
IEnumerable<TRight> right,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TKey, TResult> resultSelector,
IEqualityComparer<TKey> comparator = null,
TLeft defaultLeft = default(TLeft),
TRight defaultRight = default(TRight))
{
if (left == null) throw new ArgumentNullException("left");
if (right == null) throw new ArgumentNullException("right");
if (leftKeySelector == null) throw new ArgumentNullException("leftKeySelector");
if (rightKeySelector == null) throw new ArgumentNullException("rightKeySelector");
if (resultSelector == null) throw new ArgumentNullException("resultSelector");
comparator = comparator ?? EqualityComparer<TKey>.Default;
return FullOuterJoinIterator(left, right, leftKeySelector, rightKeySelector, resultSelector, comparator, defaultLeft, defaultRight);
}
internal static IEnumerable<TResult> FullOuterJoinIterator<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> left,
IEnumerable<TRight> right,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
Func<TLeft, TRight, TKey, TResult> resultSelector,
IEqualityComparer<TKey> comparator,
TLeft defaultLeft,
TRight defaultRight)
{
var leftLookup = left.ToLookup(leftKeySelector, comparator);
var rightLookup = right.ToLookup(rightKeySelector, comparator);
var keys = leftLookup.Select(g => g.Key).Union(rightLookup.Select(g => g.Key), comparator);
foreach (var key in keys)
foreach (var leftValue in leftLookup[key].DefaultIfEmpty(defaultLeft))
foreach (var rightValue in rightLookup[key].DefaultIfEmpty(defaultRight))
yield return resultSelector(leftValue, rightValue, key);
}
}
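This implementation has the following important properties:
These properties are important because they are what someone new to FullOuterJoin but experienced with LINQ will expect.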
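Answer 7 (score: 1)
Performs an in-memory streaming enumeration over both inputs and invokes the selector for each row. If there is no correlation at the current iteration, one of the selector arguments will be null.
Example:
var result = left.FullOuterJoin(
right,
x=>x.Key,
x=>x.Key,
(l,r) => new { LeftKey = l?.Key, RightKey=r?.Key });
Requires an IComparer for the correlation type; Comparer.Default is used if none is provided.
Requires that 'OrderBy' is applied to the input enumerables (the method below does this itself).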
/// <summary>
/// Performs a full outer join on two <see cref="IEnumerable{T}" />.
/// </summary>
/// <typeparam name="TLeft"></typeparam>
/// <typeparam name="TValue"></typeparam>
/// <typeparam name="TRight"></typeparam>
/// <typeparam name="TResult"></typeparam>
/// <param name="left"></param>
/// <param name="right"></param>
/// <param name="leftKeySelector"></param>
/// <param name="rightKeySelector"></param>
/// <param name="selector">Expression defining result type</param>
/// <param name="keyComparer">A comparer if there is no default for the type</param>
/// <returns></returns>
[System.Diagnostics.DebuggerStepThrough]
public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TValue, TResult>(
this IEnumerable<TLeft> left,
IEnumerable<TRight> right,
Func<TLeft, TValue> leftKeySelector,
Func<TRight, TValue> rightKeySelector,
Func<TLeft, TRight, TResult> selector,
IComparer<TValue> keyComparer = null)
where TLeft: class
where TRight: class
where TValue : IComparable
{
keyComparer = keyComparer ?? Comparer<TValue>.Default;
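// Sort with the same comparer that drives the merge below, so both agree on the key order
using (var enumLeft = left.OrderBy(leftKeySelector, keyComparer).GetEnumerator())
using (var enumRight = right.OrderBy(rightKeySelector, keyComparer).GetEnumerator())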
{
var hasLeft = enumLeft.MoveNext();
var hasRight = enumRight.MoveNext();
while (hasLeft || hasRight)
{
var currentLeft = enumLeft.Current;
var valueLeft = hasLeft ? leftKeySelector(currentLeft) : default(TValue);
var currentRight = enumRight.Current;
var valueRight = hasRight ? rightKeySelector(currentRight) : default(TValue);
// Compare may return any negative or positive value, so test the sign instead of matching -1 and 1
int compare =
!hasLeft ? 1
: !hasRight ? -1
: keyComparer.Compare(valueLeft, valueRight);
if (compare == 0)
{
// The keys match: emit an inner-join row and advance both sides
yield return selector(currentLeft, currentRight);
hasLeft = enumLeft.MoveNext();
hasRight = enumRight.MoveNext();
}
else if (compare < 0)
{
// The left key has no match on the right
yield return selector(currentLeft, default(TRight));
hasLeft = enumLeft.MoveNext();
}
else
{
// The right key has no match on the left
yield return selector(default(TLeft), currentRight);
hasRight = enumRight.MoveNext();
}
}
}
}
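Answer 8 (score: 1)
I've decided to add this as a separate answer because I am not positive it is tested enough. This is a re-implementation of the FullOuterJoin method that essentially uses a simplified, customized version of LINQKit's Invoke/Expand for Expression, so that it should work with Entity Framework. There is not much explanation, as it is pretty much the same as my previous answer.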
Answer 9 (score: 1)
I have a clean solution for the situation where the key is unique in both enumerables:
private static IEnumerable<TResult> FullOuterJoin<Ta, Tb, TKey, TResult>(
IEnumerable<Ta> a, IEnumerable<Tb> b,
Func<Ta, TKey> key_a, Func<Tb, TKey> key_b,
Func<Ta, Tb, TResult> selector)
{
var alookup = a.ToLookup(key_a);
var blookup = b.ToLookup(key_b);
var keys = new HashSet<TKey>(alookup.Select(p => p.Key));
keys.UnionWith(blookup.Select(p => p.Key));
return keys.Select(key => selector(alookup[key].FirstOrDefault(), blookup[key].FirstOrDefault()));
}
For example:
var ax = new[] {
new { id = 1, first_name = "ali" },
new { id = 2, first_name = "mohammad" } };
var bx = new[] {
new { id = 1, last_name = "rezaei" },
new { id = 3, last_name = "kazemi" } };
var list = FullOuterJoin(ax, bx, a => a.id, b => b.id, (a, b) => "f: " + a?.first_name + " l: " + b?.last_name).ToArray();
Outputs:
f: ali l: rezaei
f: mohammad l:
f: l: kazemi
Answer 10 (score: 0)
I wrote this extension class for an app perhaps 6 years ago, and have been using it ever since in many solutions without issues. Hope it helps.
public static class JoinExtensions
{
public static IEnumerable<TResult> FullOuterJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer,
IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)
where TInner : class
where TOuter : class
{
var innerLookup = inner.ToLookup(innerKeySelector);
var outerLookup = outer.ToLookup(outerKeySelector);
var innerJoinItems = inner
.Where(innerItem => !outerLookup.Contains(innerKeySelector(innerItem)))
.Select(innerItem => resultSelector(null, innerItem));
return outer
.SelectMany(outerItem =>
{
var innerItems = innerLookup[outerKeySelector(outerItem)];
return innerItems.Any() ? innerItems : new TInner[] { null };
}, resultSelector)
.Concat(innerJoinItems);
}
public static IEnumerable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer,
IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)
{
return outer.GroupJoin(
inner,
outerKeySelector,
innerKeySelector,
(o, i) =>
new { o = o, i = i.DefaultIfEmpty() })
.SelectMany(m => m.i.Select(inn =>
resultSelector(m.o, inn)
));
}
public static IEnumerable<TResult> RightJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer,
IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)
{
return inner.GroupJoin(
outer,
innerKeySelector,
outerKeySelector,
(i, o) =>
new { i = i, o = o.DefaultIfEmpty() })
.SelectMany(m => m.o.Select(outt =>
resultSelector(outt, m.i)
));
}
}
Answer 11 (score: 0)
Full outer join for two or more tables: first extract the column that you want to join on.
var DatesA = from A in db.T1 select A.Date;
var DatesB = from B in db.T2 select B.Date;
var DatesC = from C in db.T3 select C.Date;
var Dates = DatesA.Union(DatesB).Union(DatesC);
Then use a left outer join between the extracted column and the main tables.
var Full_Outer_Join =
(from A in Dates
join B in db.T1
on A equals B.Date into AB
from ab in AB.DefaultIfEmpty()
join C in db.T2
on A equals C.Date into ABC
from abc in ABC.DefaultIfEmpty()
join D in db.T3
on A equals D.Date into ABCD
from abcd in ABCD.DefaultIfEmpty()
select new { A, ab, abc, abcd })
.AsEnumerable();
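The same two steps can be tried in memory without a database. The sketch below is an illustration under assumptions: the Row class, the integer Date key (a stand-in for a real date column) and the sample values are hypothetical replacements for db.T1, db.T2 and db.T3.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultiWayFullJoinDemo
{
    // Hypothetical in-memory stand-in for a table row; Date is an int for brevity.
    public sealed class Row
    {
        public int Date;
        public string Value;
    }

    public static List<(int Date, string V1, string V2, string V3)> Run()
    {
        var t1 = new[] { new Row { Date = 1, Value = "a1" }, new Row { Date = 2, Value = "a2" } };
        var t2 = new[] { new Row { Date = 2, Value = "b2" }, new Row { Date = 3, Value = "b3" } };
        var t3 = new[] { new Row { Date = 4, Value = "c4" } };

        // Step 1: the union of all join keys, each key once.
        var dates = t1.Select(r => r.Date)
                      .Union(t2.Select(r => r.Date))
                      .Union(t3.Select(r => r.Date));

        // Step 2: left outer join every table onto the key set.
        var query =
            from d in dates
            join a in t1 on d equals a.Date into ga
            from a in ga.DefaultIfEmpty()
            join b in t2 on d equals b.Date into gb
            from b in gb.DefaultIfEmpty()
            join c in t3 on d equals c.Date into gc
            from c in gc.DefaultIfEmpty()
            select (Date: d, V1: a?.Value, V2: b?.Value, V3: c?.Value);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var row in Run())
            Console.WriteLine(row);
    }
}
```

Every key from any of the three inputs yields at least one row, with null in the columns of the tables that have no row for that key.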
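Answer 12 (score: 0)
I think the LINQ join clause isn't the right solution to this problem, because the purpose of the join clause is not to accumulate data in the way this task requires. The code that merges the separately created collections becomes too complicated; maybe that is OK for learning purposes, but not for real applications. One way to solve this problem is shown in the code below: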
class Program
{
static void Main(string[] args)
{
List<FirstName> firstNames = new List<FirstName>();
firstNames.Add(new FirstName { ID = 1, Name = "John" });
firstNames.Add(new FirstName { ID = 2, Name = "Sue" });
List<LastName> lastNames = new List<LastName>();
lastNames.Add(new LastName { ID = 1, Name = "Doe" });
lastNames.Add(new LastName { ID = 3, Name = "Smith" });
HashSet<int> ids = new HashSet<int>();
foreach (var name in firstNames)
{
ids.Add(name.ID);
}
foreach (var name in lastNames)
{
ids.Add(name.ID);
}
List<FullName> fullNames = new List<FullName>();
foreach (int id in ids)
{
FullName fullName = new FullName();
fullName.ID = id;
FirstName firstName = firstNames.Find(f => f.ID == id);
fullName.FirstName = firstName != null ? firstName.Name : string.Empty;
LastName lastName = lastNames.Find(l => l.ID == id);
fullName.LastName = lastName != null ? lastName.Name : string.Empty;
fullNames.Add(fullName);
}
}
}
public class FirstName
{
public int ID;
public string Name;
}
public class LastName
{
public int ID;
public string Name;
}
class FullName
{
public int ID;
public string FirstName;
public string LastName;
}
If the real collections are large, the HashSet can be built with the code below instead of the foreach loops:
List<int> firstIds = firstNames.Select(f => f.ID).ToList();
List<int> LastIds = lastNames.Select(l => l.ID).ToList();
HashSet<int> ids = new HashSet<int>(firstIds.Union(LastIds));//Only unique IDs will be included in HashSet
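Answer 13 (score: 0)
Thank you, everybody, for the interesting posts!
I modified the code because my case required it.
For those interested, this is my modified code (in VB, sorry):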
Module MyExtensions
<Extension()>
Friend Function FullOuterJoin(Of TA, TB, TResult)(ByVal a As IEnumerable(Of TA), ByVal b As IEnumerable(Of TB), ByVal joinPredicate As Func(Of TA, TB, Boolean), ByVal projection As Func(Of TA, TB, TResult), ByVal comparer As IEqualityComparer(Of TResult)) As IEnumerable(Of TResult)
Dim joinL =
From xa In a
From xb In b.Where(Function(x) joinPredicate(xa, x)).DefaultIfEmpty()
Select projection(xa, xb)
Dim joinR =
From xb In b
From xa In a.Where(Function(x) joinPredicate(x, xb)).DefaultIfEmpty()
Select projection(xa, xb)
Return joinL.Union(joinR, comparer)
End Function
End Module
Dim fullOuterJoin = lefts.FullOuterJoin(
rights,
Function(left, right) left.Code = right.Code And (left.Amount [...] Or left.Description.Contains [...]),
Function(left, right) New CompareResult(left, right),
New MyEqualityComparer
)
Public Class MyEqualityComparer
Implements IEqualityComparer(Of CompareResult)
Private Function GetMsg(obj As CompareResult) As String
Dim msg As String = ""
msg &= obj.Code & "_"
[...]
Return msg
End Function
Public Overloads Function Equals(x As CompareResult, y As CompareResult) As Boolean Implements IEqualityComparer(Of CompareResult).Equals
Return Me.GetMsg(x) = Me.GetMsg(y)
End Function
Public Overloads Function GetHashCode(obj As CompareResult) As Integer Implements IEqualityComparer(Of CompareResult).GetHashCode
Return Me.GetMsg(obj).GetHashCode
End Function
End Class
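Answer 14 (score: 0)
Yet another full outer join
Since I was not that happy with the simplicity and readability of the other proposals, I ended up with the code below.
It makes no claim to be fast (joining 1000 * 1000 takes about 800 ms on a 2020m CPU: 2.4 GHz / 2 cores). To me, it is just a compact and casual full outer join.
It works the same as a SQL FULL OUTER JOIN (duplicates are preserved).
Cheers ;-)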
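The idea is to build keyed tuples with the supplied key functions, then process the left-only items, the inner join, and the right-only items in turn.
For a quick test, place a breakpoint at the end and manually verify that it behaves as expected.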
using System;
using System.Collections.Generic;
using System.Linq;
namespace NS
{
public static class DataReunion
{
public static List<Tuple<T1, T2>> FullJoin<T1, T2, TKey>(List<T1> List1, Func<T1, TKey> KeyFunc1, List<T2> List2, Func<T2, TKey> KeyFunc2)
{
List<Tuple<T1, T2>> result = new List<Tuple<T1, T2>>();
Tuple<TKey, T1>[] identifiedList1 = List1.Select(_ => Tuple.Create(KeyFunc1(_), _)).OrderBy(_ => _.Item1).ToArray();
Tuple<TKey, T2>[] identifiedList2 = List2.Select(_ => Tuple.Create(KeyFunc2(_), _)).OrderBy(_ => _.Item1).ToArray();
identifiedList1.Where(_ => !identifiedList2.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
result.Add(Tuple.Create<T1, T2>(_.Item2, default(T2)));
});
result.AddRange(
identifiedList1.Join(identifiedList2, left => left.Item1, right => right.Item1, (left, right) => Tuple.Create<T1, T2>(left.Item2, right.Item2)).ToList()
);
identifiedList2.Where(_ => !identifiedList1.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
result.Add(Tuple.Create<T1, T2>(default(T1), _.Item2));
});
return result;
}
}
}
Answer 15 (score: -4)
I really hate these linq expressions; this is why SQL exists:
select isnull(fn.id, ln.id) as id, fn.firstname, ln.lastname
from firstnames fn
full join lastnames ln on ln.id=fn.id
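Create this as a SQL view in the database and import it as an entity.
Of course, a (distinct) union of the left and right joins will do it too, but it is stupid.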