How do I choose one record over another?

Time: 2013-08-02 16:16:46

Tags: c# linq duplicate-removal duplicate-data

I have a list that contains duplicates.

Row#  LineId  ItemDesc  ItemId  RoadTax  VehicleId  Amount
1     122317  None      -1      26.63    -78603     300
2     122317  None      -2      17.75    -78603     200
3     122317  None      -1      22.19    -78602     250
4     122317  Deli      -2      17.75    -78603     200

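In this case, row 2 is a duplicate of row 4 because LineId, RoadTax, Amount and VehicleId match. However, I want to keep the row that has an item description and remove row 2. So my output list should look like this: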

Row#  LineId  ItemDesc  ItemId  RoadTax  VehicleId  Amount
1     122317  None      -1      26.63    -78603     300
3     122317  None      -1      22.19    -78602     250
4     122317  Deli      -2      17.75    -78603     200

I wrote an IEqualityComparer class based on the example on MSDN. The class is as follows:

public class RoadTaxComparer : IEqualityComparer<RoadTaxDto>
{
    // Items are equal if VehicleId / ItemId / RoadTax / Amount are equal.
    public bool Equals(RoadTaxDto x, RoadTaxDto y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        // Check whether the items' properties are equal.
        return x.VehicleId == y.VehicleId && x.ItemId == y.ItemId && x.RoadTax == y.RoadTax && x.Amount == y.Amount;
    }

    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.
    public int GetHashCode(RoadTaxDto roadTaxDto)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(roadTaxDto, null)) return 0;

        // Get hash code for the VehicleId field.
        int hashVehicleId = roadTaxDto.VehicleId.GetHashCode();

        // Get hash code for the ItemId field.
        int hashCodeItemId = roadTaxDto.ItemId.GetHashCode();

        // Calculate the hash code for the RoadTaxDto.
        return hashVehicleId ^ hashCodeItemId;
    }
}

The RoadTaxDto structure looks like this:

class RoadTaxDto
{
    public int LineId {get;set;}
    public string ItemDesc {get;set;}
    public int ItemId {get;set;}
    public decimal RoadTax {get;set;}
    public int VehicleId {get;set;}
    public decimal Amount {get;set;}
}

I use the following statement to eliminate the duplicates.

List<RoadTaxDto> mergedList = RoadTaxes.Union(RoadTaxes, new RoadTaxComparer()).ToList();

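When I run the comparer over the list, I have no guarantee that row 2 is the one removed. So how do I make sure that, whenever a record has a duplicate, the record with "None" is always the one removed from the list?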
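A possible explanation, offered as a sketch rather than a definitive fix: Enumerable.Union yields each element the first time it is encountered, so with the list above row 2 is seen before row 4 and is the one that survives. One way to keep the comparer is to sort the rows so that those with a real description come first; OrderBy is a stable sort, so rows otherwise keep their relative order. The DescribedFirst class and its Merge method are hypothetical names, and the result comes out in sorted order rather than the original order.

```csharp
using System.Collections.Generic;
using System.Linq;

public class RoadTaxDto
{
    public int LineId { get; set; }
    public string ItemDesc { get; set; }
    public int ItemId { get; set; }
    public decimal RoadTax { get; set; }
    public int VehicleId { get; set; }
    public decimal Amount { get; set; }
}

// Same equality rule as the comparer in the question.
public class RoadTaxComparer : IEqualityComparer<RoadTaxDto>
{
    public bool Equals(RoadTaxDto x, RoadTaxDto y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (object.ReferenceEquals(x, null) || object.ReferenceEquals(y, null)) return false;
        return x.VehicleId == y.VehicleId && x.ItemId == y.ItemId
            && x.RoadTax == y.RoadTax && x.Amount == y.Amount;
    }

    public int GetHashCode(RoadTaxDto dto)
    {
        if (object.ReferenceEquals(dto, null)) return 0;
        return dto.VehicleId.GetHashCode() ^ dto.ItemId.GetHashCode();
    }
}

// Hypothetical helper, not part of the original code.
public static class DescribedFirst
{
    public static List<RoadTaxDto> Merge(IEnumerable<RoadTaxDto> rows)
    {
        // false sorts before true, so rows whose ItemDesc is not "None" come first.
        var ordered = rows.OrderBy(r => r.ItemDesc == "None").ToList();

        // Union keeps the first element of each set of equal elements,
        // which is now the described row whenever one exists.
        return ordered.Union(ordered, new RoadTaxComparer()).ToList();
    }
}
```

Calling DescribedFirst.Merge(RoadTaxes) on the four sample rows should leave three rows, with the "Deli" row kept in place of row 2.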

2 Answers:

Answer 0: (score: 1)

I would move GetHashCode() into RoadTaxDto and then do this:

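// Assumes RoadTaxDto overrides GetHashCode() to combine the fields that define a duplicate.
// Caveat: distinct rows whose hash codes collide would land in the same group.
var list2 = new List<RoadTaxDto>();
foreach (var g in list.GroupBy(i => i.GetHashCode()))
    list2.Add(
        g.FirstOrDefault(i => i.ItemDesc != "None") ??
        g.First());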
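The same idea can be sketched without relying on hash codes, by grouping on an anonymous type built from the fields the question names as defining a duplicate (LineId, RoadTax, Amount, VehicleId). Anonymous types compare by value, so hash collisions cannot merge unrelated rows. The RoadTaxDeduper class and PreferDescribed method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class RoadTaxDto
{
    public int LineId { get; set; }
    public string ItemDesc { get; set; }
    public int ItemId { get; set; }
    public decimal RoadTax { get; set; }
    public int VehicleId { get; set; }
    public decimal Amount { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class RoadTaxDeduper
{
    public static List<RoadTaxDto> PreferDescribed(IEnumerable<RoadTaxDto> rows)
    {
        return rows
            // One group per set of duplicates, keyed by value.
            .GroupBy(r => new { r.LineId, r.RoadTax, r.Amount, r.VehicleId })
            // Prefer a row with a real description; fall back to the first row.
            .Select(g => g.FirstOrDefault(r => r.ItemDesc != "None") ?? g.First())
            .ToList();
    }
}
```

For the four sample rows this should return three rows, with the 17.75 / 200 / -78603 group represented by the "Deli" row.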

Answer 1: (score: 0)

Wouldn't a pure "SQL-ish" approach work for you?

Something like this:

var list = new [] { 
    new RoadTaxDto {LineId=122317,ItemDesc="None", ItemId=-1,RoadTax =26.63M  , VehicleId=-78603  ,Amount=300},
    new RoadTaxDto {LineId=122317,ItemDesc="None", ItemId=-2,RoadTax =17.75M  , VehicleId=-78603  ,Amount=200},
    new RoadTaxDto {LineId=122317,ItemDesc="None", ItemId=-1,RoadTax =22.19M  , VehicleId=-78602  ,Amount=250},
    new RoadTaxDto {LineId=122317,ItemDesc="Deli", ItemId=-2,RoadTax =17.75M , VehicleId=-78603  ,Amount=200}
};

var query = (from c in list
             join x in list
                 on new { c.LineId, c.ItemId, c.VehicleId, c.Amount, c.RoadTax }
                 equals new { x.LineId, x.ItemId, x.VehicleId, x.Amount, x.RoadTax }
             select new RoadTaxDto {
                 LineId = c.LineId,
                 ItemDesc = x.ItemDesc != "None" ? x.ItemDesc : c.ItemDesc,
                 VehicleId = c.VehicleId,
                 Amount = c.Amount,
                 RoadTax = c.RoadTax,
                 ItemId = c.ItemId
             })
    .GroupBy(x => new { x.LineId, x.RoadTax, x.Amount, x.VehicleId })
    .Select(grp => grp.Last());

This prints:

LineId  ItemDesc    VehicleId   ItemId  RoadTax Amount
122317  None        -78603      -1      26.63   300
122317  Deli        -78603      -2      17.75   200
122317  None        -78602      -1      22.19   250