Using LINQ

Time: 2019-06-13 19:45:16

Tags: c# linq

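I am working on a function that should deduplicate (remove duplicates from) a list of objects. These are the requirements:

A tradeline is considered a duplicate if:

  • It has the same account number, account type, and date, and it is not manual

If duplicates are found, keep only the one with:

  • The latest reported date
  • If the reported dates are the same, compare the (30, 60, 90) fields and pick the tradeline with the higher value in any of those three properties

I am having trouble implementing the last bullet point. Here is my code: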

public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines)
{
    //split tradeline into manual and non-manual    
    var tradelineDictionary = tradelines.GroupBy(x => x.Source == "MAN").ToDictionary(x => x.Key, x => x.ToList());
    //create list of non-manual tradeline for dedupe logic
    var nonManualTradelines = tradelineDictionary.Where(x => x.Key == false).Select(x => x.Value).FirstOrDefault();
    var manualTradelines = tradelineDictionary.Where(x => x.Key).Select(x => x.Value).FirstOrDefault();

    //check if same reported date is present for dedupe tradelines
    var duplicate = nonManualTradelines?.GroupBy(x => new
    {
        x.ReportedDate,
        x.Account,
        x.AcctType,
        x.Date
    }).Any(g => g.Count() > 1);

    IEnumerable<Tradeline> dedupe;
    if (duplicate != null && (bool) !duplicate)
    {
        //logic for dedupe tradeline if no same reported date
        dedupe = nonManualTradelines.GroupBy(x => new
            {
                x.Account,
                x.AcctType,
                x.Date
            })
            //in case of duplicate tradelines select one with the latest date reported
            .Select(x => x.OrderByDescending(o => o.ReportedDate).First());
    }
    else
    {
        //logic for dedupe tradeline if same reported date
        dedupe = nonManualTradelines?.GroupBy(x => new
            {
                x.ReportedDate,
                x.Account,
                x.AcctType,
                x.Date
            })
            .Select(); 
            // Stuck here not sure what to do
    }

    //append manual tradeline to the output of dedupe tradelines
    var response = manualTradelines != null ? (dedupe).Union(manualTradelines) : dedupe;

    return response;
}

The Tradeline class:

public class Tradeline
{
    public string Account { get; set; }
    public string AcctType { get; set; }
    public string Late30 { get; set; }
    public string Late60 { get; set; }
    public string Late90 { get; set; }
    public string Date { get; set; }
    public string ReportedDate { get; set; }
    public string Source { get; set; }
}

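1 answer:

Answer 0 (score: 1)

You can add a secondary sort, in descending order, on the maximum of the Late x values. I also replaced the unusual use of a Dictionary with a simple, efficient split into the two categories.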

public static class ObjectExt {
    //Convert.ToInt32 returns 0 for null and throws FormatException for empty or non-numeric strings
    public static int ToInt<T>(this T obj) => Convert.ToInt32(obj);
}

public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines) {
    //split tradeline into manual and non-manual    
    var nonManualTradelines = new List<Tradeline>();
    var manualTradelines = new List<Tradeline>();
    foreach (var t in tradelines) {
        if (t.Source == "MAN")
            manualTradelines.Add(t);
        else
            nonManualTradelines.Add(t);
    }

    IEnumerable<Tradeline> dedupe = nonManualTradelines.GroupBy(t => new {
                t.Account,
                t.AcctType,
                t.Date
            })
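            //in case of duplicate tradelines select the one with the latest reported date;
            //ties are broken by the highest of the Late30/Late60/Late90 values
            //(ReportedDate is a string, so this ordering is only chronological for sortable formats such as yyyy-MM-dd)
            .Select(tg => tg
                .OrderByDescending(t => t.ReportedDate)
                .ThenByDescending(t => Math.Max(t.Late90.ToInt(), Math.Max(t.Late60.ToInt(), t.Late30.ToInt())))
                .First());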

    //append manual tradeline to the output of dedupe tradelines
    return dedupe.Union(manualTradelines);
}
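
To check the tie-break behaviour, here is a minimal self-contained sketch of the same idea. It is not the answer's exact code: the `TradelineDeduper` class and its `LateValue`, `Reported`, and `MaxLate` helpers are hypothetical names introduced for illustration, and it assumes `ReportedDate` is stored as `yyyy-MM-dd` and that a missing or non-numeric Late value should count as 0. Neither assumption is confirmed by the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Tradeline
{
    public string Account { get; set; }
    public string AcctType { get; set; }
    public string Late30 { get; set; }
    public string Late60 { get; set; }
    public string Late90 { get; set; }
    public string Date { get; set; }
    public string ReportedDate { get; set; }
    public string Source { get; set; }
}

public static class TradelineDeduper
{
    // Hypothetical helper: null, empty, or non-numeric text counts as 0 (assumption).
    private static int LateValue(string s) =>
        int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out var n) ? n : 0;

    // Assumption: ReportedDate is "yyyy-MM-dd"; unparseable values sort as the oldest.
    private static DateTime Reported(string s) =>
        DateTime.TryParseExact(s, "yyyy-MM-dd", CultureInfo.InvariantCulture,
                               DateTimeStyles.None, out var d) ? d : DateTime.MinValue;

    // Highest of the three Late fields, used as the tie-breaker.
    private static int MaxLate(Tradeline t) =>
        Math.Max(LateValue(t.Late90), Math.Max(LateValue(t.Late60), LateValue(t.Late30)));

    public static List<Tradeline> Dedupe(IEnumerable<Tradeline> tradelines)
    {
        var all = tradelines.ToList();

        var deduped = all
            .Where(t => t.Source != "MAN")
            .GroupBy(t => new { t.Account, t.AcctType, t.Date })
            // Latest reported date first, then the highest Late value among ties.
            .Select(g => g
                .OrderByDescending(t => Reported(t.ReportedDate))
                .ThenByDescending(t => MaxLate(t))
                .First());

        // Manual tradelines are appended untouched.
        return deduped.Concat(all.Where(t => t.Source == "MAN")).ToList();
    }
}
```

This sketch uses `Concat` rather than `Union` to append the manual tradelines. Since `Tradeline` does not override `Equals`, `Union` compares by reference and would only drop an object that appears in the input more than once, so the two usually give the same result; `Concat` simply makes the "append" intent explicit. Parsing the dates and Late values up front also avoids relying on string ordering and on `Convert.ToInt32` throwing for empty strings.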