Why isn't this LINQ query returning the correct dates?

Time: 2011-03-16 14:44:27

Tags: c# linq .net-4.0

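The following LINQ query reads a delimited file and returns the most recent record for each recordId. The problem is that the most recent record is not always returned. What am I doing wrong? What do I need to change to ensure the most recent date is always returned? Is there a better way than using .Max()?

I have also included some sample data so you can see the problem. In the sample data, the rows marked with an asterisk (*) are the ones I want returned (the most recent date). The rows marked with an x are, in my opinion, being returned incorrectly.

If the same record ID appears more than once (e.g. #162337) with several dates, I want a single record returned: the one with the most recent date.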

var recipients = File.ReadAllLines(path)
    .Select (record => record.Split('|'))
    .Select (tokens => new 
        {
        FirstName = tokens[2],
        LastName = tokens[4],
        recordId = Convert.ToInt32(tokens[13]),
        date = Convert.ToDateTime(tokens[17])
        }
    )
    .GroupBy (m => m.recordId)
    .OrderByDescending (m => m.Max (x => x.date ) )
    .Select (m => m.First () )
    .OrderBy (m => m.recordId )

    .Dump();


FirstName   LastName    recordId    date    
fname   lname   137308  2/15/1991 0:00  
fname   lname   138011  6/16/1983 0:00  *
fname   lname   138011  11/9/1981 0:00  x
fname   lname   158680  9/4/1986 0:00   
fname   lname   161775  4/23/1991 0:00  
fname   lname   162337  12/1/1998 0:00  *
fname   lname   162337  12/1/1998 0:00  *
fname   lname   162337  9/1/1994 0:00   x
fname   lname   162337  9/1/1994 0:00   x
fname   lname   163254  2/12/1969 0:00  
fname   lname   173816  9/26/1997 0:00  
fname   lname   178063  1/16/1980 0:00  *
fname   lname   178063  3/3/1976 0:00   x
fname   lname   180725  7/1/2007 0:00   *
fname   lname   180725  1/14/1992 0:00  x
fname   lname   181153  5/1/2001 0:00   

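2 answers:

Answer 0 (score: 5)

You are ordering the whole sequence of groups by the maximum date in each group. That changes the order of the groups but not the order of the items inside them, so First() still returns whichever item happened to come first in the file. What you need to do instead is order within each individual group, so that only the item with the maximum date is selected.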

var recipients = File.ReadAllLines(path)
                     .Select(record => record.Split('|'))
                     .Select(tokens => new
                         {
                             FirstName = tokens[2],
                             LastName = tokens[4],
                             recordId = Convert.ToInt32(tokens[13]),
                             date = Convert.ToDateTime(tokens[17])
                         })
                     .GroupBy(m => m.recordId,
                              (k, g) => g.OrderByDescending(m => m.date).First())
                     .OrderBy(m => m.recordId);
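
To make the difference concrete, here is a minimal in-memory sketch that runs both query shapes side by side. The Record class, the method names, and the row order are assumptions made for illustration (the ids and dates are borrowed from the question's sample data, with the older row assumed to come first in the file). It relies on GroupBy yielding the items of each group in their original source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the anonymous type built from the file tokens.
public class Record
{
    public int RecordId { get; set; }
    public DateTime Date { get; set; }
}

public static class LatestPerGroupDemo
{
    // Same shape as the query in the question: sorts the groups themselves,
    // then takes the first item of each group in its original (file) order.
    public static List<Record> Original(IEnumerable<Record> rows)
    {
        return rows.GroupBy(r => r.RecordId)
                   .OrderByDescending(g => g.Max(r => r.Date))
                   .Select(g => g.First())
                   .OrderBy(r => r.RecordId)
                   .ToList();
    }

    // Corrected shape: sorts inside each group, so First() is the latest record.
    public static List<Record> Fixed(IEnumerable<Record> rows)
    {
        return rows.GroupBy(r => r.RecordId,
                            (k, g) => g.OrderByDescending(r => r.Date).First())
                   .OrderBy(r => r.RecordId)
                   .ToList();
    }

    // Assumed file order: the older row of each id comes first.
    public static List<Record> SampleRows()
    {
        return new List<Record>
        {
            new Record { RecordId = 138011, Date = new DateTime(1981, 11, 9) },
            new Record { RecordId = 138011, Date = new DateTime(1983, 6, 16) },
            new Record { RecordId = 178063, Date = new DateTime(1976, 3, 3) },
            new Record { RecordId = 178063, Date = new DateTime(1980, 1, 16) },
        };
    }

    public static void Main()
    {
        foreach (var r in Original(SampleRows()))
            Console.WriteLine("original: {0} {1:yyyy-MM-dd}", r.RecordId, r.Date);
        foreach (var r in Fixed(SampleRows()))
            Console.WriteLine("fixed:    {0} {1:yyyy-MM-dd}", r.RecordId, r.Date);
    }
}
```

With this row order, Original returns the 1981 and 1976 rows (the first row of each group), while Fixed returns the 1983 and 1980 rows.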

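If performance is important and each group may contain a lot of items, you will probably see a slight improvement if you use Aggregate to determine the maximum record in each group rather than the OrderByDescending / First combination.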
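
The Aggregate approach could be sketched roughly as follows. This is an illustration under assumptions, not necessarily the code the answer originally showed: the Record class and the LatestPerId method are hypothetical names. Aggregate without a seed starts from the first item of the group and makes a single pass, keeping whichever record has the later date, so no per-group sort is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the anonymous type built from the file tokens.
public class Record
{
    public int RecordId { get; set; }
    public DateTime Date { get; set; }
}

public static class AggregateMaxDemo
{
    // One pass per group: keep the record with the later date.
    // The strict '>' keeps the earlier row when two dates are equal.
    public static List<Record> LatestPerId(IEnumerable<Record> rows)
    {
        return rows.GroupBy(r => r.RecordId,
                            (k, g) => g.Aggregate((best, r) => r.Date > best.Date ? r : best))
                   .OrderBy(r => r.RecordId)
                   .ToList();
    }

    public static void Main()
    {
        // Illustrative rows only; ids and dates borrowed from the question's sample data.
        var rows = new List<Record>
        {
            new Record { RecordId = 180725, Date = new DateTime(1992, 1, 14) },
            new Record { RecordId = 180725, Date = new DateTime(2007, 7, 1) },
            new Record { RecordId = 137308, Date = new DateTime(1991, 2, 15) },
        };

        foreach (var r in LatestPerId(rows))
            Console.WriteLine("{0} {1:yyyy-MM-dd}", r.RecordId, r.Date);
    }
}
```

Groups produced by GroupBy are never empty, so the seedless Aggregate overload does not hit its empty-sequence exception here.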

Answer 1 (score: 2)

Is it possible that this line:

.OrderByDescending (m => m.Max (x => x.date ) )

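orders the groups by their maximum date, rather than the items within each group?

This hacked-together snippet seems to produce the result you are looking for (although you will obviously have to plug in your own file handling):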

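        // Customer is assumed to be a simple class with RecordId (int) and Birthday (DateTime) properties.
        List<Customer> Customers = new List<Customer>() {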
            new Customer(){ RecordId = 12, Birthday = new DateTime(1970, 1, 1)},
            new Customer(){ RecordId = 12, Birthday = new DateTime(1982, 3, 22)},
            new Customer(){ RecordId = 12, Birthday = new DateTime(1990, 1, 1)},

            new Customer(){ RecordId = 14, Birthday = new DateTime(1960, 1, 1)},
            new Customer(){ RecordId = 14, Birthday = new DateTime(1990, 5, 15)},
        };

        var groups = Customers.GroupBy(c => c.RecordId);
        IEnumerable<Customer> itemsFromGroupWithMaxDate = groups.Select(g => g.OrderByDescending(c => c.Birthday).First());

        foreach(Customer C in itemsFromGroupWithMaxDate)
            Console.WriteLine(String.Format("{0} {1}", C.RecordId, C.Birthday));

Or better:

IEnumerable<Customer> itemsFromGroupWithMaxDate = Customers.GroupBy(c => c.RecordId).Select(g => g.OrderByDescending(c => c.Birthday).First());

Taking a blind stab at your code, I believe this might work:

var recipients = File.ReadAllLines(path)
    .Select (record => record.Split('|'))
    .Select (tokens => new 
        {
        FirstName = tokens[2],
        LastName = tokens[4],
        recordId = Convert.ToInt32(tokens[13]),
        date = Convert.ToDateTime(tokens[17])
        }
    )
    .GroupBy (m => m.recordId)
    .Select(m => m.OrderByDescending(x => x.date).First())
    .OrderBy (m => m.recordId )

    .Dump();