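I let users download some data to a CSV. They can then edit some of the columns and upload it back. I need a fast way to compare certain columns between matching objects to see what changed.
Currently I pull the original data from the database into a list, so it is all in memory. There are about 100k items, so it's not that bad; that part takes less than a second. Then I load the CSV file and put it into a list. Both lists have the same class type.
Then I loop over the CSV data (they might have removed some rows they didn't change, but they could still have changed a lot of rows). For each row in the CSV list I query the list that came from the DB to find that object. Now I have the CSV object and the object from the database in the same structure. Then I run them through a custom object compare function that looks at certain columns to see if anything changed.
If something did change, I have to validate that what they entered is a valid value by querying another reference list for that column. If it's not valid, I write it to an exceptions list. At the end, if there are no exceptions I save to the DB. If there are any exceptions I don't save anything and I show them the list of errors.
The detailed compare provides a list of the columns that changed, along with the old and new values. I need this to query the reference list and make sure the new value is valid before I make the change. It's fairly inefficient, but it gives users very valuable detail about problems with their upload.
This is very slow. I'm looking for ways to speed it up while still being able to give the user detailed information about why it failed so they can correct it.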

    // get all the new records from the csv
    var newData = csv.GetRecords<MyTable>().ToArray();

    // select all data from database to list
    var origData = ctx.MyTable.Select(s => s).ToList();

    // look for any changes in the new data and update the database. note we are looping over the new data, so if they removed some rows from the csv file we just won't loop over those and they won't change
    foreach (var d in newData)
    {
        // find data so we can compare between new (csv) and current (from db) to see what possibly changed
        var oData = (from o in origData
                     where o.id == d.id
                     select o).FirstOrDefault();

        // only the columns in the comparableColumns list are compared
        var diff = d.DetailedCompare(oData, comparableColumns.ToList());

        if (diff.Count > 0)
        {
            // a difference between the csv record and the db record doesn't mean the user's input is valid. only existing ref data is valid, and it needs to be checked before a change is made
            bool changed = false;

            // make a copy of this original data; we'll check afterwards whether we were actually able to make a change to it (was the value provided valid)
            var data = CopyRecord(oData);

            // update this record's data fields that have changed with the new data
            foreach (var v in diff)
            {
                // special check for setting a value to NULL: it's always valid to do this, but it wouldn't show up in the ref data to pass the next check below
                if (v.valA == null)
                {
                    oData.GetType().GetProperty(v.Prop).SetValue(oData, v.valA);
                    oData.UpdatedBy = user;
                    oData.UpdatedDate = DateTime.Now;
                    changed = true;
                }
                // validate that the value for this column is in the ref table before allowing an update. note an exception if not, so we can tell the user
                else if (refData[v.Prop].Where(a => a.value == v.valA.ToString()).FirstOrDefault() != null)
                {
                    // update the current object's value with the new object's value, as it changed and is a valid value based on the ref data defined for that column
                    oData.GetType().GetProperty(v.Prop).SetValue(oData, v.valA);
                    oData.UpdatedBy = user;
                    oData.UpdatedDate = DateTime.Now;
                    changed = true;
                }
                else
                {
                    // the value provided isn't valid for this column, so note this to tell the user
                    exceptions.Add(string.Format("Error: ID: {0}, Value: '{1}' is not valid for column [{2}]. Add the reference data if needed and re-import.", d.id, v.valA, v.Prop));
                }
            }

            // we only need to reattach and save off changes IF we actually changed something to a valid ref value and we had no exceptions for this record
            if (changed && exceptions.Count == 0)
            {
                // because our current object was in memory, we reattach it to EF so we can mark it as changed and SaveChanges() will write it back to the DB
                ctx.MyTable.Attach(oData);
                ctx.Entry(oData).State = EntityState.Modified;

                // add a history record for the change to this product
                CreateHistoryRecord(data, user);
            }
        }
    }

    // wait until the very end before making DB changes. we don't save anything if there are exceptions or nothing changed
    if (exceptions.Count == 0)
    {
        ctx.SaveChanges();
    }

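Answer 0 (score: 4)
The first big win would be to put your data into a dictionary so you can get at the desired value quickly by ID, without having to search through thousands of objects for it. I'm pretty sure it will be faster.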
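To make the dictionary suggestion concrete, here is a minimal sketch. The Row class is a hypothetical stand-in for the question's MyTable entity, and an int id is an assumption, since the question doesn't show the key type. The point is that the index is built once and each lookup is then a hash probe rather than a scan of the whole list, which with roughly 100k rows on each side can otherwise mean up to about 10^10 comparisons.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the MyTable entity in the question.
public class Row
{
    public int id { get; set; }          // assumed to be an int and unique
    public string Color { get; set; }    // illustrative column only
}

public static class LookupSketch
{
    // Build the index once. ToDictionary throws ArgumentException on a
    // duplicate key, so this assumes id is unique in the table.
    public static Dictionary<int, Row> IndexById(IEnumerable<Row> origData)
    {
        return origData.ToDictionary(o => o.id);
    }

    // Returns the matching DB row, or null when the CSV contains an id
    // that does not exist in the database.
    public static Row Find(Dictionary<int, Row> byId, int id)
    {
        return byId.TryGetValue(id, out var row) ? row : null;
    }
}
```

In the question's loop this would replace the LINQ query over origData: build the dictionary right after loading origData, then call Find(byId, d.id) for each CSV row. The null case is worth handling explicitly, since a CSV row with an unknown id would otherwise reach DetailedCompare with a null oData.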
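Beyond that, I suggest you run your code through a profiler to determine which parts are slowest. It's entirely possible that DetailedCompare() is doing something terribly slow that just isn't obvious.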
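If a full profiler isn't at hand, a rough substitute is to time each stage of the loop with System.Diagnostics.Stopwatch and add up the totals. The StageTimer helper below is a hypothetical sketch, not part of the question's code, and it is not thread-safe, so it only suits the single-threaded loop as posted.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: accumulates elapsed time per named stage.
public static class StageTimer
{
    private static readonly Dictionary<string, TimeSpan> Totals =
        new Dictionary<string, TimeSpan>();

    // Runs the work, adds its elapsed time to the stage total,
    // and passes the result through unchanged.
    public static T Measure<T>(string stage, Func<T> work)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            return work();
        }
        finally
        {
            sw.Stop();
            // soFar is TimeSpan.Zero when the stage has not been seen yet.
            Totals.TryGetValue(stage, out var soFar);
            Totals[stage] = soFar + sw.Elapsed;
        }
    }

    public static IReadOnlyDictionary<string, TimeSpan> Report()
    {
        return Totals;
    }
}
```

Wrapping the lookup, the DetailedCompare call and the reference-data check each in their own Measure call, then printing Report() after the loop, should show which of the three dominates before you decide what to optimise.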
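Answer 1 (score: 0)
One thing to consider is running the compare asynchronously and/or running the if (diff.Count > 0) block asynchronously.
At least the latter: assuming the changes are scattered at random, why wait on all the copying and reflection? Put it in a separate function and run it in parallel.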
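One way this could look is sketched below, under some assumptions. The Change class is a hypothetical shape for one entry of the DetailedCompare result, and the reference data is assumed to have been pre-built into one HashSet of allowed strings per column. Only the validation runs in parallel. The Attach and SaveChanges calls should stay on a single thread, because an Entity Framework DbContext is not thread-safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical shape of one entry returned by DetailedCompare.
public class Change
{
    public int Id { get; set; }
    public string Prop { get; set; }
    public object valA { get; set; }   // the new value from the CSV
}

public static class ParallelValidation
{
    // Validates all changes in parallel and returns the error messages.
    // refValues is only read here, never modified, during the parallel loop.
    public static List<string> Validate(
        IEnumerable<Change> changes,
        IReadOnlyDictionary<string, HashSet<string>> refValues)
    {
        var errors = new ConcurrentBag<string>();

        Parallel.ForEach(changes, c =>
        {
            // Setting a value to null is always valid, as in the question.
            if (c.valA == null) return;

            // Assumption: a column with no reference data counts as invalid.
            if (!refValues.TryGetValue(c.Prop, out var allowed) ||
                !allowed.Contains(c.valA.ToString()))
            {
                errors.Add(string.Format(
                    "Error: ID: {0}, Value: '{1}' is not valid for column [{2}].",
                    c.Id, c.valA, c.Prop));
            }
        });

        // ConcurrentBag does not preserve insertion order.
        return new List<string>(errors);
    }
}
```

If the returned list is empty, the changes can then be applied and saved sequentially as in the original code. Sort the messages by ID before showing them if the order matters to the user.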