Grouping consecutive dates into ranges

Time: 2013-10-17 17:16:48

Tags: c# linq c#-4.0

I have a list of objects:

public class sample
{
 public DateTime Date;
 public string content;
}

I want to be able to create a new list of objects:

public class sampleWithIntervals
{
 public DateTime startDate;
 public DateTime endDate;
 public string content;
}

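The sample objects should be grouped into intervals by content. An interval may include only dates that are present in the original list of samples. I don't know how to do this in LINQ.

Sample data: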

{"10/1/2013", "x"}
{"10/2/2013", "x"}
{"10/2/2013", "y"}
{"10/3/2013", "x"}
{"10/3/2013", "y"}
{"10/10/2013", "x"}
{"10/11/2013", "x"}
{"10/15/2013", "y"}
{"10/16/2013", "y"}
{"10/20/2013", "y"}

This should give me 
{"10/1/2013","10/3/2013", "x"}
{"10/2/2013","10/3/2013", "y"}
{"10/10/2013","10/11/2013", "x"}
{"10/15/2013","10/16/2013", "y"}
{"10/20/2013","10/20/2013", "y"}

2 answers:

Answer 0 (score: 8)

Here is a non-LINQ way:

List<sampleWithIntervals> groups = new List<sampleWithIntervals>();  
sampleWithIntervals curGroup = null;

foreach(sample s in samples.OrderBy(sa => sa.content).ThenBy(sa => sa.Date))
{
    if(curGroup == null || // first group
        s.Date != curGroup.endDate.AddDays(1) ||
        s.content != curGroup.content   // new group
      ) 
    {
        curGroup = new sampleWithIntervals() {startDate = s.Date, endDate = s.Date, content = s.content};
        groups.Add(curGroup);
    }
    else
    {
        // add to current group
        curGroup.endDate = s.Date;
    }
}

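You can also do this with LINQ, using the trick of grouping items by their date minus their index. Within a run of consecutive dates, the date and the index both increase by one at each step, so the difference stays constant: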

samples.OrderBy(s => s.content)   
       .ThenBy(s => s.Date)
       // select each item with its index
       .Select ((s, i) => new {sample = s, index = i})  
       // group by date minus index to group consecutive items
       .GroupBy(si => new {date = si.sample.Date.AddDays(-si.index), content = si.sample.content})  
       // get the min, max, and content of each group
       .Select(g => new sampleWithIntervals() {
                        startDate = g.Min(s => s.sample.Date), 
                        endDate = g.Max(s => s.sample.Date), 
                        content = g.First().sample.content
                        })
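
To see why the date-minus-index key works, here is a minimal sketch that computes only the key for an already ordered run of dates. The names `DateMinusIndexDemo` and `KeysFor` are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DateMinusIndexDemo
{
    // Returns, for each date in an ordered sequence, the date minus its position.
    // Consecutive dates share the same key; a gap in the dates changes the key.
    public static List<DateTime> KeysFor(IEnumerable<DateTime> orderedDates)
    {
        return orderedDates
            .Select((d, i) => d.AddDays(-i))
            .ToList();
    }

    public static void Main()
    {
        // The dates of the "x" items from the question, already sorted.
        var dates = new[]
        {
            new DateTime(2013, 10, 1),
            new DateTime(2013, 10, 2),
            new DateTime(2013, 10, 3),
            new DateTime(2013, 10, 10),
            new DateTime(2013, 10, 11),
        };

        foreach (var key in KeysFor(dates))
            Console.WriteLine(key.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        // 2013-10-01
        // 2013-10-01
        // 2013-10-01
        // 2013-10-07
        // 2013-10-07
    }
}
```

The first three dates collapse to one key and the last two to another, which is exactly the grouping the query needs. The trick assumes that dates are distinct within each content value: a duplicate date advances the index without advancing the date, so it would split a run.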

Answer 1 (score: 0)

I have this SplitBy extension method, where you specify a delimiter predicate to split the collection with, much like string.Split.

public static IEnumerable<IEnumerable<T>> SplitBy<T>(this IEnumerable<T> source, 
                                                     Func<T, bool> delimiterPredicate,
                                                     bool includeEmptyEntries = false, 
                                                     bool includeSeparator = false)
{
    var l = new List<T>();
    foreach (var x in source)
    {
        if (!delimiterPredicate(x))
            l.Add(x);
        else
        {
            if (includeEmptyEntries || l.Count != 0)
            {
                if (includeSeparator)
                    l.Add(x);

                yield return l;
            }

            l = new List<T>();
        }
    }
    if (l.Count != 0 || includeEmptyEntries)
        yield return l;
}
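
As a rough illustration of how the two flags behave, the sketch below splits a list of integers at every zero. The wrapper class `SplitByDemo` and the helper `Describe` are hypothetical names; the SplitBy body repeats the logic above so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitByDemo
{
    // Same logic as the SplitBy method above, repeated so this sketch is self-contained.
    public static IEnumerable<IEnumerable<T>> SplitBy<T>(this IEnumerable<T> source,
                                                         Func<T, bool> delimiterPredicate,
                                                         bool includeEmptyEntries = false,
                                                         bool includeSeparator = false)
    {
        var l = new List<T>();
        foreach (var x in source)
        {
            if (!delimiterPredicate(x))
            {
                l.Add(x);
                continue;
            }
            if (includeEmptyEntries || l.Count != 0)
            {
                if (includeSeparator)
                    l.Add(x);
                yield return l;
            }
            l = new List<T>();
        }
        if (l.Count != 0 || includeEmptyEntries)
            yield return l;
    }

    // Formats chunks for display: items joined by ",", chunks joined by "|".
    public static string Describe(IEnumerable<IEnumerable<int>> chunks)
    {
        return string.Join("|", chunks.Select(c => string.Join(",", c)));
    }

    public static void Main()
    {
        var numbers = new[] { 1, 2, 0, 3, 0, 0, 4 };

        // Defaults: delimiters and empty chunks are dropped.
        Console.WriteLine(Describe(numbers.SplitBy(n => n == 0)));              // 1,2|3|4
        // includeSeparator: the delimiter is kept at the end of its chunk.
        Console.WriteLine(Describe(numbers.SplitBy(n => n == 0, false, true))); // 1,2,0|3,0|4
        // includeEmptyEntries: the empty chunk between the two zeros is kept.
        Console.WriteLine(Describe(numbers.SplitBy(n => n == 0, true, false))); // 1,2|3||4
    }
}
```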

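So if you can specify a delimiter that marks the end of a consecutive streak, splitting becomes easy. To do that, order the collection and zip it with itself shifted by one item, so that each item is paired with its successor. The difference between the two dates in each pair can then serve as the delimiter.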

var ordered = samples.OrderBy(x => x.content).ThenBy(x => x.Date).ToArray();
var result = ordered.Zip(ordered.Skip(1).Append(new sample()), (start, end) => new { start, end })
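                    // a pair ends a streak when the content changes or the dates are not consecutive
                    .SplitBy(x => x.start.content != x.end.content ||
                                  x.end.Date - x.start.Date != TimeSpan.FromDays(1), true, true)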
                    .Select(x => x.Select(p => p.start).ToArray())
                    .Where(x => x.Any())
                    .Select(x => new sampleWithIntervals
                    {
                        content = x.First().content,
                        startDate = x.First().Date,
                        endDate = x.Last().Date
                    });

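new sample() is a dummy instance that makes the Zip come out right: without it, the last item would have no successor to pair with and would be dropped. Append is a method that appends items to an IEnumerable<> sequence, and it looks like this: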

public static IEnumerable<T> Append<T>(this IEnumerable<T> source, params T[] items)
{
    return source.Concat(items);
}
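
The pairing step can be tried in isolation. The following sketch pairs each integer with its successor and uses a filler value for the last item, playing the role of the dummy sample. The names `AdjacentPairsDemo` and `AdjacentPairs` are hypothetical, and the built-in Concat stands in for the Append helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdjacentPairsDemo
{
    // Pairs every element with the one after it. The last element is paired
    // with `filler`, because Zip stops as soon as the shorter sequence ends.
    public static List<string> AdjacentPairs(int[] items, int filler)
    {
        return items.Zip(items.Skip(1).Concat(new[] { filler }),
                         (start, end) => start + "->" + end)
                    .ToList();
    }

    public static void Main()
    {
        foreach (var pair in AdjacentPairs(new[] { 1, 2, 3, 7 }, 0))
            Console.WriteLine(pair);
        // 1->2
        // 2->3
        // 3->7
        // 7->0
    }
}
```

Here the pairs 3->7 and 7->0 are the ones whose difference is not 1, so they are the delimiters that close the streaks 1..3 and 7.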

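Note: this does not preserve the initial order. If you want the original order, select the index up front into an anonymous type (Select((x, i) => new { x, i })), and in the last stage sort by that index before selecting the appropriate type.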
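
One possible way to keep the original order is sketched below. For brevity it swaps the SplitBy pipeline for a plain loop over the sorted items, so it is not the exact query from this answer. The types `Sample` and `DateRange`, the method `ToRanges`, and the `FirstIndex` field are hypothetical names introduced for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Sample
{
    public DateTime Date;
    public string Content;
}

public class DateRange
{
    public DateTime Start;
    public DateTime End;
    public string Content;
    public int FirstIndex; // smallest input position among the items in this range
}

public static class OrderPreservingDemo
{
    public static List<DateRange> ToRanges(IEnumerable<Sample> samples)
    {
        // Remember each item's original position before sorting.
        var indexed = samples.Select((x, i) => new { x, i })
                             .OrderBy(p => p.x.Content)
                             .ThenBy(p => p.x.Date);

        var ranges = new List<DateRange>();
        DateRange current = null;

        foreach (var p in indexed)
        {
            bool startsNewRange = current == null ||
                                  p.x.Content != current.Content ||
                                  p.x.Date != current.End.AddDays(1);
            if (startsNewRange)
            {
                current = new DateRange
                {
                    Start = p.x.Date,
                    End = p.x.Date,
                    Content = p.x.Content,
                    FirstIndex = p.i
                };
                ranges.Add(current);
            }
            else
            {
                // Extend the current range and keep its earliest input position.
                current.End = p.x.Date;
                current.FirstIndex = Math.Min(current.FirstIndex, p.i);
            }
        }

        // Restore input order: sort ranges by where they first appeared.
        return ranges.OrderBy(r => r.FirstIndex).ToList();
    }
}
```

With the sample data from the question, sorting by content alone would list both "x" ranges before any "y" range. Ordering by `FirstIndex` interleaves them again in the order shown in the expected output.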