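The following LINQ query reads a delimited file and is supposed to return the most recent record for each recordId. The problem is that the most recent record is not always returned. What am I doing wrong? What do I need to change to ensure the most recent date is always returned? Is there a better approach than using .Max()?
I have also included some sample data so you can see the problem. In the sample data, the rows marked with an asterisk (*) are the ones I want returned (the most recent date). The rows marked with an x are the ones that, as far as I can tell, are being returned incorrectly.
If the same record ID appears multiple times (e.g. #162337) with several dates, I want to return a single record: the one with the most recent date.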
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy (m => m.recordId)
.OrderByDescending (m => m.Max (x => x.date ) )
.Select (m => m.First () )
.OrderBy (m => m.recordId )
.Dump();
FirstName LastName recordId date
fname lname 137308 2/15/1991 0:00
fname lname 138011 6/16/1983 0:00 *
fname lname 138011 11/9/1981 0:00 x
fname lname 158680 9/4/1986 0:00
fname lname 161775 4/23/1991 0:00
fname lname 162337 12/1/1998 0:00 *
fname lname 162337 12/1/1998 0:00 *
fname lname 162337 9/1/1994 0:00 x
fname lname 162337 9/1/1994 0:00 x
fname lname 163254 2/12/1969 0:00
fname lname 173816 9/26/1997 0:00
fname lname 178063 1/16/1980 0:00 *
fname lname 178063 3/3/1976 0:00 x
fname lname 180725 7/1/2007 0:00 *
fname lname 180725 1/14/1992 0:00 x
fname lname 181153 5/1/2001 0:00
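Answer 0 (score: 5)
You are ordering the entire sequence of groups by the maximum date in each group. What you need to do instead is order the items within each individual group, so that only the item with the maximum date is selected.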
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
})
.GroupBy(m => m.recordId,
(k, g) => g.OrderByDescending(m => m.date).First())
.OrderBy(m => m.recordId);
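To make the difference concrete, here is a minimal sketch comparing the two queries on in-memory rows. The rows array is a hypothetical stand-in for the parsed file (ids and dates borrowed from the sample data), and it assumes the older row of each id comes first in the file, which is what would produce the x rows.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the parsed file; the older row of each id is listed first (assumption).
var rows = new[]
{
    new { recordId = 138011, date = new DateTime(1981, 11, 9) },
    new { recordId = 138011, date = new DateTime(1983, 6, 16) },
    new { recordId = 162337, date = new DateTime(1994, 9, 1) },
    new { recordId = 162337, date = new DateTime(1998, 12, 1) },
};

// Original query: sorts the groups by their max date, then takes each group's
// first row in source order, which is not necessarily the latest one.
var original = rows
    .GroupBy(m => m.recordId)
    .OrderByDescending(g => g.Max(x => x.date))
    .Select(g => g.First())
    .OrderBy(m => m.recordId)
    .ToList();

// Corrected query: sorts the rows inside each group and takes the latest.
var corrected = rows
    .GroupBy(m => m.recordId, (k, g) => g.OrderByDescending(m => m.date).First())
    .OrderBy(m => m.recordId)
    .ToList();

foreach (var r in original) Console.WriteLine($"original:  {r.recordId} {r.date:yyyy-MM-dd}");
foreach (var r in corrected) Console.WriteLine($"corrected: {r.recordId} {r.date:yyyy-MM-dd}");
```

With these rows, the original query returns the 1981 and 1994 rows, while the corrected one returns the 1983 and 1998 rows.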
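If performance matters and each group may contain many items, you might see a slight improvement by using Aggregate to find the maximum record in each group instead of the OrderByDescending / First combination.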
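A minimal sketch of what the Aggregate variant could look like, not necessarily the exact code this answer had in mind. The rows array is a hypothetical stand-in for the parsed file, and best and next are illustrative parameter names.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the rows parsed from the file.
var rows = new[]
{
    new { recordId = 138011, date = new DateTime(1981, 11, 9) },
    new { recordId = 138011, date = new DateTime(1983, 6, 16) },
    new { recordId = 162337, date = new DateTime(1998, 12, 1) },
    new { recordId = 162337, date = new DateTime(1994, 9, 1) },
};

var latest = rows
    .GroupBy(m => m.recordId,
        // One pass over each group: keep whichever row has the later date.
        // On a tie the row seen first is kept.
        (k, g) => g.Aggregate((best, next) => next.date > best.date ? next : best))
    .OrderBy(m => m.recordId)
    .ToList();

foreach (var r in latest) Console.WriteLine($"{r.recordId} {r.date:yyyy-MM-dd}");
```

Aggregate walks each group once instead of sorting it, so the work per group is linear rather than O(n log n). The seedless overload throws on an empty sequence, but a group produced by GroupBy always has at least one item. On .NET 6 or later, g.MaxBy(m => m.date) should express the same idea more directly.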
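Answer 1 (score: 2)
Could it be that this line:
.OrderByDescending (m => m.Max (x => x.date ) )
is ordering the groups by their maximum date, rather than the items within each group?
This trimmed-down snippet seems to produce the result you are looking for (though you will obviously still need to do the file handling yourself):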
List<Customer> Customers = new List<Customer>() {
new Customer(){ RecordId = 12, Birthday = new DateTime(1970, 1, 1)},
new Customer(){ RecordId = 12, Birthday = new DateTime(1982, 3, 22)},
new Customer(){ RecordId = 12, Birthday = new DateTime(1990, 1, 1)},
new Customer(){ RecordId = 14, Birthday = new DateTime(1960, 1, 1)},
new Customer(){ RecordId = 14, Birthday = new DateTime(1990, 5, 15)},
};
var groups = Customers.GroupBy(c => c.RecordId);
IEnumerable<Customer> itemsFromGroupWithMaxDate = groups.Select(g => g.OrderByDescending(c => c.Birthday).First());
foreach(Customer C in itemsFromGroupWithMaxDate)
Console.WriteLine(String.Format("{0} {1}", C.RecordId, C.Birthday));
Or better:
IEnumerable<Customer> itemsFromGroupWithMaxDate = Customers.GroupBy(c => c.RecordId).Select(g => g.OrderByDescending(c => c.Birthday).First());
Taking a blind stab at your code, I believe this might work:
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy (m => m.recordId)
.Select(m => m.OrderByDescending(x => x.date).First())
.OrderBy (m => m.recordId )
.Dump();