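I am working on a function that should dedupe (remove duplicates from) a list of objects. These are the requirements:
A tradeline is considered a duplicate if:
If duplicates are found, select only the one that has
I am having trouble implementing the last bullet point. Here is my code: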
public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines)
{
    //split tradeline into manual and non-manual
    var tradelineDictionary = tradelines.GroupBy(x => x.Source == "MAN").ToDictionary(x => x.Key, x => x.ToList());
    //create list of non-manual tradeline for dedupe logic
    var nonManualTradelines = tradelineDictionary.Where(x => x.Key == false).Select(x => x.Value).FirstOrDefault();
    var manualTradelines = tradelineDictionary.Where(x => x.Key).Select(x => x.Value).FirstOrDefault();
    //check if same reported date is present for dedupe tradelines
    var duplicate = nonManualTradelines?.GroupBy(x => new
    {
        x.ReportedDate,
        x.Account,
        x.AcctType,
        x.Date
    }).Any(g => g.Count() > 1);
    IEnumerable<Tradeline> dedupe;
    if (duplicate != null && (bool) !duplicate)
    {
        //logic for dedupe tradeline if no same reported date
        dedupe = nonManualTradelines.GroupBy(x => new
        {
            x.Account,
            x.AcctType,
            x.Date
        })
        //in case of duplicate tradelines select one with the latest date reported
        .Select(x => x.OrderByDescending(o => o.ReportedDate).First());
    }
    else
    {
        //logic for dedupe tradeline if same reported date
        dedupe = nonManualTradelines?.GroupBy(x => new
        {
            x.ReportedDate,
            x.Account,
            x.AcctType,
            x.Date
        })
        .Select();
        // Stuck here not sure what to do
    }
    //append manual tradeline to the output of dedupe tradelines
    var response = manualTradelines != null ? (dedupe).Union(manualTradelines) : dedupe;
    return response;
}
The Tradeline class:
public class Tradeline
{
    public string Account { get; set; }
    public string AcctType { get; set; }
    public string Late30 { get; set; }
    public string Late60 { get; set; }
    public string Late90 { get; set; }
    public string Date { get; set; }
    public string ReportedDate { get; set; }
    public string Source { get; set; }
}
Answer 0 (score: 1)
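You can order by the maximum Late* value (Late30, Late60, Late90), descending, as a tie-breaker after the reported date. I also replaced the peculiar use of Dictionary with a simple, efficient split into the two categories.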
public static class ObjectExt {
    // Converts a value (here, the Late* strings) to int via Convert.ToInt32;
    // an empty or non-numeric string throws FormatException
    public static int ToInt<T>(this T obj) => Convert.ToInt32(obj);
}

public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines) {
    //split tradeline into manual and non-manual
    var nonManualTradelines = new List<Tradeline>();
    var manualTradelines = new List<Tradeline>();
    foreach (var t in tradelines) {
        if (t.Source == "MAN")
            manualTradelines.Add(t);
        else
            nonManualTradelines.Add(t);
    }
    IEnumerable<Tradeline> dedupe = nonManualTradelines.GroupBy(t => new {
            t.Account,
            t.AcctType,
            t.Date
        })
        //in case of duplicate tradelines select one with the latest date reported;
        //on a tie, prefer the one with the highest Late30/Late60/Late90 value
        //(ReportedDate is a string, so it is compared as text, not as a date)
        .Select(tg => tg
            .OrderByDescending(t => t.ReportedDate)
            .ThenByDescending(t => Math.Max(t.Late90.ToInt(), Math.Max(t.Late60.ToInt(), t.Late30.ToInt())))
            .First());
    //append manual tradeline to the output of dedupe tradelines
    return dedupe.Union(manualTradelines);
}
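Because ReportedDate and the Late* fields are strings, the ordering above compares dates as text, and Convert.ToInt32 throws on an empty or non-numeric value. The following is a minimal sketch of the same grouping and ordering with tolerant parsing, not part of the original answer. It assumes that dates are in a format DateTime.TryParse accepts under the invariant culture and that blank Late* values should count as 0. The names TradelineDedupe, ParseLate, ParseDate and MaxLate, and the sample Source values other than "MAN", are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Tradeline
{
    public string Account { get; set; }
    public string AcctType { get; set; }
    public string Late30 { get; set; }
    public string Late60 { get; set; }
    public string Late90 { get; set; }
    public string Date { get; set; }
    public string ReportedDate { get; set; }
    public string Source { get; set; }
}

public static class TradelineDedupe
{
    // Hypothetical helper: null, blank, or non-numeric text counts as 0 (assumption).
    public static int ParseLate(string value) =>
        int.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out var n) ? n : 0;

    // Hypothetical helper: unparseable dates become DateTime.MinValue and so sort last.
    public static DateTime ParseDate(string value) =>
        DateTime.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.None, out var d)
            ? d
            : DateTime.MinValue;

    // Largest of the three late counters for one tradeline.
    public static int MaxLate(Tradeline t) =>
        Math.Max(ParseLate(t.Late90), Math.Max(ParseLate(t.Late60), ParseLate(t.Late30)));

    public static IEnumerable<Tradeline> Dedupe(IEnumerable<Tradeline> tradelines)
    {
        var all = tradelines.ToList();
        var manual = all.Where(t => t.Source == "MAN");

        // One survivor per (Account, AcctType, Date): latest reported date first,
        // then the highest Late* value as the tie-breaker.
        var deduped = all.Where(t => t.Source != "MAN")
            .GroupBy(t => new { t.Account, t.AcctType, t.Date })
            .Select(g => g.OrderByDescending(t => ParseDate(t.ReportedDate))
                          .ThenByDescending(t => MaxLate(t))
                          .First());

        // Manual tradelines are appended without deduplication, as in the answer.
        return deduped.Union(manual);
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical sample data: three non-manual duplicates and one manual entry.
        var input = new List<Tradeline>
        {
            new Tradeline { Account = "123", AcctType = "CC", Date = "2019-01-01", ReportedDate = "2020-03-01", Late30 = "1", Late60 = "0", Late90 = "", Source = "A" },
            new Tradeline { Account = "123", AcctType = "CC", Date = "2019-01-01", ReportedDate = "2020-03-01", Late30 = "0", Late60 = "2", Late90 = "0", Source = "B" },
            new Tradeline { Account = "123", AcctType = "CC", Date = "2019-01-01", ReportedDate = "2020-01-01", Late30 = "0", Late60 = "0", Late90 = "5", Source = "C" },
            new Tradeline { Account = "123", AcctType = "CC", Date = "2019-01-01", ReportedDate = "2020-02-01", Late30 = "0", Late60 = "0", Late90 = "0", Source = "MAN" },
        };

        foreach (var t in TradelineDedupe.Dedupe(input))
            Console.WriteLine($"{t.Source} {t.ReportedDate} {TradelineDedupe.MaxLate(t)}");
    }
}
```

With this sample, the two entries reported on 2020-03-01 tie on date, so the one with the larger Late* value (source "B") survives, and the older entry (source "C") is dropped even though its Late90 is higher. The manual entry is passed through untouched.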