LINQ: group contiguous time ranges

Time: 2011-11-11 21:16:44

Tags: c# linq linq-to-objects

Suppose I have a simple structure that looks like this:

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        this.Start = start;
        this.End = end;
    }
}

And I create a collection like this:

var dr1 = new Range(new DateTime(2011, 11, 1, 12, 0, 0), 
    new DateTime(2011, 11, 1, 13, 0, 0));
var dr2 = new Range(new DateTime(2011, 11, 1, 13, 0, 0), 
    new DateTime(2011, 11, 1, 14, 0, 0));
var dr3 = new Range(new DateTime(2011, 11, 1, 14, 0, 0), 
    new DateTime(2011, 11, 1, 15, 0, 0));
var dr4 = new Range(new DateTime(2011, 11, 1, 16, 0, 0), 
    new DateTime(2011, 11, 1, 17, 0, 0));

var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };

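What I want to do is group them into contiguous ranges; that is, two ranges are contiguous if the End value of the previous range equals the Start of the next.

We can assume there are no collisions, duplicates, or overlaps in the Range values.

In the example posted, I would end up with two groups: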

2011-11-1 12:00:00 - 2011-11-1 15:00:00

2011-11-1 16:00:00 - 2011-11-1 17:00:00

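Coming up with an iterative solution for this is fairly easy. But is there some LINQ magic I can use to get this in a pretty one-liner?

7 answers:

Answer 0: (score: 9)

Your best bet is to use yield and an extension method: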

static IEnumerable<Range> GroupContinuous(this IEnumerable<Range> ranges)
{
    // Validate parameters.

    // Can order by start date, no overlaps, no collisions
    ranges = ranges.OrderBy(r => r.Start);

    // Get the enumerator.
    using (IEnumerator<Range> enumerator = ranges.GetEnumerator())
    {
        // Move to the first item, if nothing, break.
        if (!enumerator.MoveNext()) yield break;

        // Set the previous range.
        Range previous = enumerator.Current;

        // Cycle while there are more items.
        while (enumerator.MoveNext())
        {
            // Get the current item.
            Range current = enumerator.Current;

            // If the start date is equal to the end date
            // then merge with the previous and continue.
            if (current.Start == previous.End)
            {
                // Merge.
                previous = new Range(previous.Start, current.End);

                // Continue.
                continue;
            }

            // Yield the previous item.
            yield return previous;

            // The previous item is the current item.
            previous = current;
        }

        // Yield the previous item.
        yield return previous;
    }
}

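Of course, the call to OrderBy causes a full iteration of the ranges sequence, and there is no way around that. Once the sequence is ordered, though, you don't have to materialize the results before returning them; you simply yield each result as the conditions dictate.

However, if you know the sequence is already ordered, you don't need to call OrderBy at all, and you can yield items as you traverse the list, breaking whenever a new collapsed Range instance begins.

Ultimately, if the sequence is unordered, you have two options:

  • Order the list and then process it (remember, OrderBy is deferred as well, but it must perform one full iteration to order the sequence), using yield to return each item as soon as you have one to process
  • Process the entire sequence at once and return it as a fully materialized sequence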
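For the already-ordered case, a minimal sketch might look like the following. The method name GroupContinuousSorted and the Demo class are hypothetical, and the code assumes the caller guarantees the input is sorted by Start with no overlaps, so no OrderBy call is made and each merged range is yielded as soon as a gap is seen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        Start = start;
        End = end;
    }
}

public static class RangeExtensions
{
    // Hypothetical name. Assumes the input is already sorted by Start
    // and has no overlaps, so nothing is buffered or re-ordered.
    public static IEnumerable<Range> GroupContinuousSorted(this IEnumerable<Range> ranges)
    {
        if (ranges == null) throw new ArgumentNullException(nameof(ranges));
        return Iterate(ranges);
    }

    private static IEnumerable<Range> Iterate(IEnumerable<Range> ranges)
    {
        Range previous = null;
        foreach (Range current in ranges)
        {
            if (previous != null && current.Start == previous.End)
            {
                // Contiguous: extend the range being built.
                previous = new Range(previous.Start, current.End);
                continue;
            }

            // Gap (or first item): emit what we have and start a new range.
            if (previous != null) yield return previous;
            previous = new Range(current.Start, current.End);
        }

        // Emit the final range, if the input was not empty.
        if (previous != null) yield return previous;
    }
}

public static class Demo
{
    public static void Main()
    {
        var day = new DateTime(2011, 11, 1);
        var ranges = new List<Range>
        {
            new Range(day.AddHours(12), day.AddHours(13)),
            new Range(day.AddHours(13), day.AddHours(14)),
            new Range(day.AddHours(14), day.AddHours(15)),
            new Range(day.AddHours(16), day.AddHours(17)),
        };

        foreach (Range r in ranges.GroupContinuousSorted())
            Console.WriteLine("{0} - {1}", r.Start, r.End);
    }
}
```

Because the iterator holds only the range currently being built, it can also consume a source that is produced lazily. If the input turns out not to be sorted, the result will simply contain unmerged ranges, so the ordering assumption is on the caller.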

Answer 1: (score: 5)

A generic version of casperOne's extension method, used as follows:

var items = new[]
    {
        // Range 1
        new { A = 0, B = 1 },
        new { A = 1, B = 2 },
        new { A = 2, B = 3 },
        new { A = 3, B = 4 },
        // Range 2
        new { A = 5, B = 6 },
        new { A = 6, B = 7 },
        new { A = 7, B = 8 },
        new { A = 8, B = 9 },
    };

var ranges = items.ContinousRange(
    x => x.A,
    x => x.B,
    (lower, upper) => new { A = lower, B = upper });

foreach(var range in ranges)
{
    Console.WriteLine("{0} - {1}", range.A, range.B);
}

Implementation of the extension method:

    /// <summary>
    /// Calculates contiguous ranges based on each element's lower and upper bound selectors. Cannot compensate for overlapping ranges.
    /// </summary>
    /// <typeparam name="T">The type containing a range</typeparam>
    /// <typeparam name="T1">The type of range values</typeparam>
    /// <param name="source">The ranges to be combined</param>
    /// <param name="lowerSelector">The range's lower bound</param>
    /// <param name="upperSelector">The range's upper bound</param>
    /// <param name="factory">A factory to create a new range</param>
    /// <returns>An enumeration of continuous ranges</returns>
    public static IEnumerable<T> ContinousRange<T, T1>(this IEnumerable<T> source, Func<T, T1> lowerSelector, Func<T, T1> upperSelector, Func<T1, T1, T> factory)
    {
        //Validate parameters

        // Can order by start date, no overlaps, no collisions
        source = source.OrderBy(lowerSelector);

        // Get enumerator
        using(var enumerator = source.GetEnumerator())
        {
            // Move to the first item, if nothing, break.
            if (!enumerator.MoveNext()) yield break;

            // Set the previous range.
            var previous = enumerator.Current;

            // Cycle while there are more items
            while(enumerator.MoveNext())
            {
                // Get the current item.
                var current = enumerator.Current;

                // If the current lower bound equals the previous upper bound,
                // then merge with the previous and continue.
                if (lowerSelector(current).Equals(upperSelector(previous)))
                {
                    // Merge
                    previous = factory(lowerSelector(previous), upperSelector(current));

                    // Continue
                    continue;
                }

                // Yield the previous item.
                yield return previous;

                // The previous item is the current item.
                previous = current;
            }

            // Yield the previous item.
            yield return previous;
        }
    }

Answer 2: (score: 3)

You can use the Aggregate() method and a lambda to group them together. This assumes, as you said, that there are no duplicates or overlaps:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) =>
{
    // find an already collected range that ends where next starts - O(n) scan
    var last = con.FirstOrDefault(r => r.End == next.Start);

    if (last != null)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   

Now, if you know the ranges are ordered, you can get away with always comparing the next range against the last one processed, like so:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) =>
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = con.LastOrDefault();

    if (last != null && last.End == next.Start)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   
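Note that both Aggregate versions above add the caller's own Range objects to the result and later overwrite their End, so the input list is modified as a side effect. A minimal sketch of a variant that copies each range before extending it could look like this; the helper name MergeWithAggregate and the Demo class are hypothetical, and the input is assumed to be sorted by Start with no overlaps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        Start = start;
        End = end;
    }
}

public static class RangeMerging
{
    // Hypothetical helper. Assumes `ranges` is sorted by Start and has no overlaps.
    public static List<Range> MergeWithAggregate(IEnumerable<Range> ranges)
    {
        return ranges.Aggregate(new List<Range>(), (merged, next) =>
        {
            // Last merged range so far, or null when the list is still empty.
            Range last = merged.Count > 0 ? merged[merged.Count - 1] : null;

            if (last != null && last.End == next.Start)
                last.End = next.End;                          // extends our copy, not the caller's object
            else
                merged.Add(new Range(next.Start, next.End));  // copy, so the input stays untouched

            return merged;
        });
    }
}

public static class Demo
{
    public static void Main()
    {
        var day = new DateTime(2011, 11, 1);
        var ranges = new List<Range>
        {
            new Range(day.AddHours(12), day.AddHours(13)),
            new Range(day.AddHours(13), day.AddHours(14)),
            new Range(day.AddHours(14), day.AddHours(15)),
            new Range(day.AddHours(16), day.AddHours(17)),
        };

        foreach (Range r in RangeMerging.MergeWithAggregate(ranges))
            Console.WriteLine("{0} - {1}", r.Start, r.End);
    }
}
```

The extra allocation per group is the price of leaving the input alone; whether that matters depends on whether the original ranges are used again afterwards.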

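Answer 3: (score: 2)

Here is another LINQ solution. It finds the start of each contiguous range with one query, the end of each contiguous range with another, and then walks the pairs to build the new ranges.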

var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);
var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);
var result = starts.Zip(ends, (s, e) => new Range(s.Start, e.End));

It can be rewritten as a one-liner, but the separate version is clearer and easier to maintain:

var result = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End).Zip(ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start), (s, e) => new Range(s.Start, e.End));
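The Where/Zip queries index into ranges[i - 1] and ranges[i + 1], so they rely on ranges being an indexable list whose neighbours in the list are also neighbours in time. A minimal sketch that makes this explicit by sorting a copy first is shown below; the wrapper name MergeWithZip and the Demo class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        Start = start;
        End = end;
    }
}

public static class RangeZip
{
    // Hypothetical wrapper. Sorts a copy so that index neighbours are time neighbours;
    // still assumes the ranges do not overlap.
    public static List<Range> MergeWithZip(IEnumerable<Range> input)
    {
        List<Range> ranges = input.OrderBy(r => r.Start).ToList();

        // A range opens a group if it is first or does not touch its predecessor.
        var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);

        // A range closes a group if it is last or does not touch its successor.
        var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);

        // The n-th opener and the n-th closer belong to the same group.
        return starts.Zip(ends, (s, e) => new Range(s.Start, e.End)).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var day = new DateTime(2011, 11, 1);
        var shuffled = new List<Range>
        {
            new Range(day.AddHours(16), day.AddHours(17)),
            new Range(day.AddHours(13), day.AddHours(14)),
            new Range(day.AddHours(12), day.AddHours(13)),
            new Range(day.AddHours(14), day.AddHours(15)),
        };

        foreach (Range r in RangeZip.MergeWithZip(shuffled))
            Console.WriteLine("{0} - {1}", r.Start, r.End);
    }
}
```

The pairing works because every group has exactly one opening and one closing range, and both filtered sequences come out in the same order.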

Answer 4: (score: 1)

The following works, but it really is an abuse of LINQ:

// Dummy at the end to get the last contiguous range
ranges.Add(new Range(default(DateTime), default(DateTime)));
// The previous element in the list
Range previous = ranges.FirstOrDefault();
// The start element of the current continuous range
Range start = previous;
// The query is deferred, so assign it and materialize it with ToList() to actually run it
var merged = ranges.Skip(1).Select(x => {var result = new {current = x, previous = previous};
                            previous = x; return result;})
              .Where(x => x.previous.End != x.current.Start)
              .Select(x => { var result = new Range(start.Start, x.previous.End);
                             start = x.current; return result; })
              .ToList();

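The code does the following:

First Select:

  1. Save the current value and the current previous value in an instance of a new anonymous type.
  2. Set the previous value to the current value for the next iteration.

Where: select only those elements that mark the start of a new contiguous range.

Second Select:

  1. Create a Range object with the Start of the saved start value and the End of the previous item.
  2. Set the start value to the current item, since it marks the start of a new contiguous range.
  3. Return the range created in (1).

Please note:
I would stick with the iterative solution, because the code above is unreadable and unmaintainable, and it took me significantly more time than just typing out a loop and an if...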

Answer 5: (score: 1)

The following code does it in query comprehension syntax.

public static List<Range> Group(List<Range> dates){
    if(dates.Any()){
        var previous = dates.FirstOrDefault();
        previous = new Range(previous.Start,previous.Start);
        var last = dates.Last();
        var result = (from dt in dates
                     let isDone = dt.Start != previous.End
                     let prev = isDone || last == dt ? 
                                                previous :  
                                               (previous = new Range(previous.Start,dt.End))
                     where isDone || last == dt
                     let t = (previous = dt)
                     select prev).ToList();
        if(result.Last().End != last.End)
           result.Add(last);
        return result;
    }
    return Enumerable.Empty<Range>().ToList();
}

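I don't think I would actually do this in production code, because I feel it violates the rule of least surprise. LINQ statements usually have no side effects, whereas this one works only because of its side effects. Still, I thought it was worth posting to show that the problem can indeed be solved with query comprehension syntax in O(n).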
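To illustrate why side effects inside a query can surprise readers, here is a small standalone sketch that is unrelated to Range; the class and method names are hypothetical. A deferred query whose lambda mutates a captured variable produces different results each time it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SideEffectDemo
{
    // Enumerates the same deferred query twice and returns both result lists.
    public static (List<int> First, List<int> Second) EnumerateTwice()
    {
        int counter = 0; // captured and mutated by the lambda below

        // Nothing runs here; the lambda executes each time the query is enumerated.
        IEnumerable<int> query = new[] { 10, 20, 30 }.Select(x => x + counter++);

        List<int> first = query.ToList();  // counter goes 0, 1, 2
        List<int> second = query.ToList(); // counter continues with 3, 4, 5

        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = EnumerateTwice();
        Console.WriteLine(string.Join(", ", first));  // prints "10, 21, 32"
        Console.WriteLine(string.Join(", ", second)); // prints "13, 24, 35"
    }
}
```

Queries that mutate captured state, like the ones in this answer and the previous one, have the same property, which is why materializing them exactly once (for example with ToList()) matters.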

Answer 6: (score: 0)

        var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };
        var starts = ranges.Select(p => p.Start);
        var ends = ranges.Select(p => p.End);
        var discreet = starts.Union(ends).Except(starts.Intersect(ends)).OrderBy(p => p).ToList();
        List<Range> result = new List<Range>();
        for (int i = 0; i < discreet.Count; i = i + 2)
        {
            result.Add(new Range(discreet[i], discreet[i + 1]));
        }
        return result;