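I am trying to compare two DataTables using LINQ. They are very simple tables, with only one column but about 44,000 rows. I am using the code below, but when I trace it, execution reaches if (dr.Any()) and just sits there; neither the next line nor the exception handler is ever reached: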
public static DataTable GetTableDiff(DataTable dt1, DataTable dt2)
{
    DataTable dtDiff = new DataTable();
    try
    {
        var dr = from r in dt1.AsEnumerable()
                 where !dt2.AsEnumerable().Any(r2 =>
                     r["FacilityID"].ToString().Trim().ToLower() ==
                     r2["FacilityID"].ToString().Trim().ToLower())
                 select r;
        if (dr.Any())
            dtDiff = dr.CopyToDataTable();
    }
    catch (Exception ex)
    {
    }
    return dtDiff;
}
I raised the maximum request length in web.config to rule that out as the problem, but nothing changed:
<system.web>
<compilation debug="true" targetFramework="4.5" />
<httpRuntime targetFramework="4.5" maxRequestLength="1048576" />
I don't think 44,000 rows is too many, is it?
Answer 0 (score: 4)
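Join the tables, which costs O(N1 + N2), instead of doing an O(N1 * N2) search. Currently every row in dt1 scans all rows in dt2; if both tables hold about 44,000 rows, that is up to roughly 1.9 billion string comparisons, each one allocating new trimmed and lower-cased strings:
var diff = from r1 in dt1.AsEnumerable()
           join r2 in dt2.AsEnumerable()
               on r1.Field<string>("FacilityID").Trim().ToLower()
               equals r2.Field<string>("FacilityID").Trim().ToLower() into g
           where !g.Any() // get only rows which do not have joined rows from dt2
           select r1;
With a join, the key (the facility ID) is also computed only once per row.
Another option is to create a simple row comparer:
public class FacilityIdComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y) => GetFacilityID(x) == GetFacilityID(y);
    public int GetHashCode(DataRow row) => GetFacilityID(row)?.GetHashCode() ?? 0;
    private string GetFacilityID(DataRow row)
        => row.Field<string>("FacilityID")?.Trim().ToLower();
}
Getting the new rows is then a single call to the LINQ Except method, passing the rows of dt2 and an instance of this comparer.
The same comparer also works for finding the intersection, for example with Intersect.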
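As a minimal sketch of that Except call, assuming both tables have a string FacilityID column: the MakeTable and NewIds helpers and the sample IDs are hypothetical and only there to make the example self-contained. The comparer is the one from this answer, repeated so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Rows are considered equal when their trimmed, lower-cased FacilityID values match.
public class FacilityIdComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y) => GetFacilityID(x) == GetFacilityID(y);
    public int GetHashCode(DataRow row) => GetFacilityID(row)?.GetHashCode() ?? 0;
    private static string GetFacilityID(DataRow row)
        => row.Field<string>("FacilityID")?.Trim().ToLower();
}

public static class ExceptDemo
{
    // Hypothetical helper: builds a one-column table from the given IDs.
    public static DataTable MakeTable(params string[] ids)
    {
        var table = new DataTable();
        table.Columns.Add("FacilityID", typeof(string));
        foreach (var id in ids)
            table.Rows.Add(id);
        return table;
    }

    // Hypothetical helper: IDs of rows in dt1 with no matching row in dt2.
    public static List<string> NewIds(DataTable dt1, DataTable dt2)
    {
        return dt1.AsEnumerable()
                  .Except(dt2.AsEnumerable(), new FacilityIdComparer())
                  .Select(r => r.Field<string>("FacilityID"))
                  .ToList();
    }

    public static void Main()
    {
        var dt1 = MakeTable("A1", " b2 ", "C3");
        var dt2 = MakeTable("a1", "B2");
        foreach (var id in NewIds(dt1, dt2))
            Console.WriteLine(id); // prints "C3"
    }
}
```

Note that Except has set semantics, so rows in dt1 that share the same normalized FacilityID are returned only once. Swapping Except for Intersect would return the rows of dt1 that do have a match in dt2.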
Answer 1 (score: 3)
I would use a different, more lightweight approach, since you are only taking rows from one table and only want those that have a new FacilityId:
public static DataTable GetTableDiff(DataTable dtNew, DataTable dtOld)
{
    DataTable dtDiff = dtNew.Clone(); // no data only columns and constraints
    var oldFacilityIds = dtOld.AsEnumerable().Select(r => r.Field<string>("FacilityID").Trim());
    var oldFacilityIDSet = new HashSet<string>(oldFacilityIds, StringComparer.CurrentCultureIgnoreCase);
    var newRows = dtNew.AsEnumerable()
        .Where(r => !oldFacilityIDSet.Contains(r.Field<string>("FacilityID").Trim()));
    foreach (DataRow row in newRows)
        dtDiff.ImportRow(row);
    return dtDiff;
}
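A small usage sketch of the HashSet approach above, with two hypothetical in-memory tables. One deviation is an assumption on my part rather than part of the answer: the Key helper maps a null (DBNull) FacilityID to an empty string, because Field<string> returns null for DBNull and calling Trim() on it would throw. MakeTable is likewise a made-up helper.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class TableDiffDemo
{
    // Assumption: a null/DBNull FacilityID is treated as an empty key instead of throwing.
    private static string Key(DataRow row)
        => (row.Field<string>("FacilityID") ?? string.Empty).Trim();

    public static DataTable GetTableDiff(DataTable dtNew, DataTable dtOld)
    {
        DataTable dtDiff = dtNew.Clone(); // schema only, no rows

        // Build the lookup set once; each Contains call is then O(1) on average.
        var oldIds = new HashSet<string>(
            dtOld.AsEnumerable().Select(r => Key(r)),
            StringComparer.CurrentCultureIgnoreCase);

        foreach (DataRow row in dtNew.AsEnumerable().Where(r => !oldIds.Contains(Key(r))))
            dtDiff.ImportRow(row);
        return dtDiff;
    }

    // Hypothetical helper: builds a one-column table from the given IDs.
    public static DataTable MakeTable(params string[] ids)
    {
        var table = new DataTable();
        table.Columns.Add("FacilityID", typeof(string));
        foreach (var id in ids)
            table.Rows.Add(id); // a null id is stored as DBNull
        return table;
    }

    public static void Main()
    {
        var dtNew = MakeTable("A1", " b2", "C3");
        var dtOld = MakeTable("a1", "B2 ", null);
        DataTable diff = GetTableDiff(dtNew, dtOld);
        foreach (DataRow row in diff.Rows)
            Console.WriteLine(row["FacilityID"]); // prints "C3"
    }
}
```

Unlike Except, this keeps every row of dtNew whose ID is missing from dtOld, including duplicates within dtNew.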