从数据表中删除重复条目的最佳方法

时间:2010-12-11 06:26:15

标签: c# .net datatable duplicate-data

从数据表中删除重复条目的最佳方法是什么?

11 个答案:

答案 0 :(得分:70)

对当前正在使用的DataTable执行dtEmp

DataTable distinctTable = dtEmp.DefaultView.ToTable( /*distinct*/ true);

很好。

答案 1 :(得分:67)

删除重复项

public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
   Hashtable hTable = new Hashtable();
   ArrayList duplicateList = new ArrayList();

   //Add list of all the unique item value to hashtable, which stores combination of key, value pair.
   //And add duplicate item value in arraylist.
   foreach (DataRow drow in dTable.Rows)
   {
      if (hTable.Contains(drow[colName]))
         duplicateList.Add(drow);
      else
         hTable.Add(drow[colName], string.Empty); 
   }

   //Removing a list of duplicate items from datatable.
   foreach (DataRow dRow in duplicateList)
      dTable.Rows.Remove(dRow);

   //Datatable which contains unique records will be return as output.
      return dTable;
}

以下链接

http://www.dotnetspider.com/resources/4535-Remove-duplicate-records-from-table.aspx

http://www.dotnetspark.com/kb/94-remove-duplicate-rows-value-from-datatable.aspx

用于删除列

中的重复项

http://dotnetguts.blogspot.com/2007/02/removing-duplicate-records-from.html

答案 2 :(得分:13)

这篇文章是关于从多个列中仅从数据表中获取Distincts行。

Public coid removeDuplicatesRows(DataTable dt)
{
  DataTable uniqueCols = dt.DefaultView.ToTable(true, "RNORFQNo", "ManufacturerPartNo",  "RNORFQId", "ItemId", "RNONo", "Quantity", "NSNNo", "UOMName", "MOQ", "ItemDescription");
} 

您需要调用此方法,并且需要为datatable指定值。 在上面的代码中,我们有RNORFQNo,PartNo,RFQ id,ItemId,RNONo,QUANTity,NSNNO,UOMName,MOQ和Item Description作为我们想要不同值的列。

答案 3 :(得分:10)

一种简单的方法是:

 var newDt= dt.AsEnumerable()
                 .GroupBy(x => x.Field<int>("ColumnName"))
                 .Select(y => y.First())
                 .CopyToDataTable();

答案 4 :(得分:3)

使用Linq GroupBy方法有一种简单的方法。

var duplicateValues = dt.AsEnumerable() 

        .GroupBy(row => row[0]) 

        .Where(group => (group.Count() == 1 || group.Count() > 1)) 

        .Select(g => g.Key); 



foreach (var d in duplicateValues)

        Console.WriteLine(d);

答案 5 :(得分:2)

使用AsEnumerable().Distinct()

,这是一种简单快捷的方式
private DataTable RemoveDuplicatesRecords(DataTable dt)
{
    //Returns just 5 unique rows
    var UniqueRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);
    DataTable dt2 = UniqueRows.CopyToDataTable();
    return dt2;
}

CLICK to visit my blog for more Detail

答案 6 :(得分:1)

    /* To eliminate Duplicate rows */
    private void RemoveDuplicates(DataTable dt)
    {

        if (dt.Rows.Count > 0)
        {
            for (int i = dt.Rows.Count - 1; i >= 0; i--)
            {
                if (i == 0)
                {
                    break;
                }
                for (int j = i - 1; j >= 0; j--)
                {
                    if (Convert.ToInt32(dt.Rows[i]["ID"]) == Convert.ToInt32(dt.Rows[j]["ID"]) && dt.Rows[i]["Name"].ToString() == dt.Rows[j]["Name"].ToString())
                    {
                        dt.Rows[i].Delete();
                        break;
                    }
                }
            }
            dt.AcceptChanges();
        }
    }

答案 7 :(得分:0)

完全不同的行:

public static DataTable Dictinct(this dt) => dt.DefaultView.ToTable(true);

与特定行不同(请注意,“distinctCulumnNames”中提到的列将在结果DataTable中返回):

public static DataTable Dictinct(this dt, params string[] distinctColumnNames) => 
dt.DefaultView.ToTable(true, distinctColumnNames);

与特定列不同(保留给定DataTable中的所有列):

public static void Distinct(this DataTable dataTable, string distinctColumnName)
{
    var distinctResult = new DataTable();
    distinctResult.Merge(
                     .GroupBy(row => row.Field<object>(distinctColumnName))
                     .Select(group => group.First())
                     .CopyToDataTable()
            );

    if (distinctResult.DefaultView.Count < dataTable.DefaultView.Count)
    {
        dataTable.Clear();
        dataTable.Merge(distinctResult);
        dataTable.AcceptChanges();
    }
}

答案 8 :(得分:0)

您可以使用DataTable的DefaultView.ToTable方法进行这样的过滤(适应C#):

 Public Sub RemoveDuplicateRows(ByRef rDataTable As DataTable)
    Dim pNewDataTable As DataTable
    Dim pCurrentRowCopy As DataRow
    Dim pColumnList As New List(Of String)
    Dim pColumn As DataColumn

    'Build column list
    For Each pColumn In rDataTable.Columns
        pColumnList.Add(pColumn.ColumnName)
    Next

    'Filter by all columns
    pNewDataTable = rDataTable.DefaultView.ToTable(True, pColumnList.ToArray)

    rDataTable = rDataTable.Clone

    'Import rows into original table structure
    For Each pCurrentRowCopy In pNewDataTable.Rows
        rDataTable.ImportRow(pCurrentRowCopy)
    Next
End Sub

答案 9 :(得分:0)

为了区分所有数据表列,您可以轻松地检索字符串数组中的列名

public static DataTable RemoveDuplicateRows(this DataTable dataTable)
{
    List<string> columnNames = new List<string>();
    foreach (DataColumn col in dataTable.Columns)
    {
        columnNames.Add(col.ColumnName);
    }
    return dataTable.DefaultView.ToTable(true, columnNames.Select(c => c.ToString()).ToArray());
}

您会注意到,我想到了将其用作DataTable类的扩展

答案 10 :(得分:0)

我更喜欢这个,因为这比 DefaultView.ToTable 和 foreach 循环删除重复项更快。使用它,我们也可以对多列进行分组。

DataTable distinctDT = (from rows in dt.AsEnumerable() 
group rows by new { ColA = rows["ColA"], ColB = rows["ColB"]} into grp
select grp.First()).CopyToDataTable();