Most efficient way to find all duplicate rows for each specific row in a DataTable

Time: 2017-10-26 10:42:33

Tags: c# asp.net-mvc validation datatable

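I am trying to validate the contents of an Excel file, and I have managed to read its contents into a DataTable. I want to loop through each row and, for each row, determine which other rows duplicate it, collecting the results into a list of a Record class.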

Here is how I intend the user to see the errors:

Rule: the combination of Domain and Username must be unique

Input Excel / DataTable

Domain  Username Details
x       sam1     1234     //row1
x       jack1    4412     //row2
x       sam1     1233     //row3
y       jack1    4442     //row4
z       jason1   5522     //row5
x       sam1     8949     //row6

Output in the view

Invalid entries:

Row   Domain  Username Details  ErrorMessage
 1    x       sam1     1234     Duplicate row for r3, r6
 3    x       sam1     1233     Duplicate row for r1, r6                       
 6    x       sam1     8949     Duplicate row for r1, r3 

Valid entries:

Row    Domain  Username Details  
 2      x       jack1    4412 //not duplicate with row4 because different domain
 4      y       jack1    4442 //row4
 5      z       jason1   5522

In the code-behind I plan to build a List of a class I created, which the view will use.

public class Record
{
    public long Row { get; set; }
    public string Domain { get; set; }
    public string Username { get; set; }
    public string Details { get; set; }
    public string ErrorMessage { get; set; }
    public bool IsValid { get; set; }
}

Code-behind:

DataTable excelDataTable = ... // read the Excel file into a DataTable
foreach (DataRow row in excelDataTable.Rows) // loop through each row of the DataTable
{
    // collects the error text; a StringBuilder because more validations will be added
    StringBuilder errorStringBuilder = new StringBuilder();
    Record record = new Record
    {
        Domain = row["Domain"].ToString(),
        Username = row["Username"].ToString(),
        Details = row["Details"].ToString(),
    };

    if (/* most efficient way to determine whether the row has duplicates, returning bool */)
    {
        // most efficient way to get the other rows that the current row duplicates
        errorStringBuilder.Append("Duplicate with: " ...); // append the rows it duplicates
    }

    if (errorStringBuilder.Length != 0)
    {
        record.ErrorMessage = errorStringBuilder.ToString();
        record.IsValid = false;
    }
}

Returned list

return new List<Record>
{
    new Record { Row = 1, Domain = "x", Username = "sam1", Details = "1234", ErrorMessage = "Duplicate row for r3, r6", IsValid = false },
    new Record { Row = 2, Domain = "x", Username = "jack1", Details = "4412", ErrorMessage = "", IsValid = true },
    new Record { Row = 3, Domain = "x", Username = "sam1", Details = "1233", ErrorMessage = "Duplicate row for r1, r6", IsValid = false },
    new Record { Row = 4, Domain = "y", Username = "jack1", Details = "4442", ErrorMessage = "", IsValid = true },
    new Record { Row = 5, Domain = "z", Username = "jason1", Details = "5522", ErrorMessage = "", IsValid = true },
    new Record { Row = 6, Domain = "x", Username = "sam1", Details = "8949", ErrorMessage = "Duplicate row for r1, r3", IsValid = false }
};

Using LINQ would be nice. Either way, I just want the most efficient approach.

1 Answer:

Answer 0 (score: 2)

You can use Enumerable.GroupBy, which is efficient:

var usernameDomainGroups = excelDataTable.AsEnumerable()
  .Select((r, index) => new { Row = r, RowNumber = index + 1 })
  .GroupBy(x => new { UserName = x.Row.Field<string>("Username"), Domain = x.Row.Field<string>("Domain") });

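These groups contain all rows, duplicates and unique ones alike. You can process them in a loop and build the record list. The tricky part is generating the duplicate information: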

var recordList = new List<Record>();
foreach (var group in usernameDomainGroups)
{
    bool isValid = !group.Skip(1).Any();
    if (isValid)
        recordList.Add(new Record
        {
            Row = group.First().RowNumber,
            Domain = group.Key.Domain,
            Username = group.Key.UserName,
            Details = group.First().Row.Field<string>("Details"),
            IsValid = true,
            ErrorMessage = null
        });
    else
    {
        // tricky part, if you need further help ask
        // start with group.Select(r => new Record ....
        // then you can use recordList.AddRange(...)
    }
}
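
One possible way to fill in the branch left open above is sketched below. It is not the answerer's code: `DuplicateValidator` and `BuildRecords` are hypothetical names, and the sketch assumes that the `Domain` and `Username` columns are string-typed, that key comparison is case-sensitive, and that row numbers are 1-based data rows as in the question. It emits one Record per row, listing the other members of the same group in the error message.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public class Record
{
    public long Row { get; set; }
    public string Domain { get; set; }
    public string Username { get; set; }
    public string Details { get; set; }
    public string ErrorMessage { get; set; }
    public bool IsValid { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DuplicateValidator
{
    public static List<Record> BuildRecords(DataTable table)
    {
        // Group rows by (Domain, Username); anonymous types compare by value.
        var groups = table.AsEnumerable()
            .Select((r, index) => new { Row = r, RowNumber = index + 1 })
            .GroupBy(x => new
            {
                Domain = x.Row.Field<string>("Domain"),
                Username = x.Row.Field<string>("Username")
            });

        var records = new List<Record>();
        foreach (var group in groups)
        {
            var members = group.ToList();
            foreach (var member in members)
            {
                // Every other row in the same group is a duplicate of this one.
                var others = members
                    .Where(m => m.RowNumber != member.RowNumber)
                    .Select(m => "r" + m.RowNumber)
                    .ToList();

                records.Add(new Record
                {
                    Row = member.RowNumber,
                    Domain = group.Key.Domain,
                    Username = group.Key.Username,
                    // Convert.ToString tolerates a non-string Details column.
                    Details = Convert.ToString(member.Row["Details"]),
                    IsValid = others.Count == 0,
                    ErrorMessage = others.Count == 0
                        ? null
                        : "Duplicate row for " + string.Join(", ", others)
                });
            }
        }

        // GroupBy yields groups in first-occurrence order, so restore row order.
        return records.OrderBy(r => r.Row).ToList();
    }
}
```

Calling `DuplicateValidator.BuildRecords(excelDataTable)` returns a single list in row order that the view can split on `IsValid`. Grouping is one pass over the table; building each message costs time proportional to the size of that row's group, which only matters if a single key is duplicated very many times.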