LINQ query - data aggregation (group adjacent)

Time: 2013-02-14 16:16:29

Tags: c# linq grouping

Let's take a class called Cls:

public class Cls
{
    public int SequenceNumber { get; set; }
    public int Value { get; set; }
}

Now, let's fill a collection with the following elements:

Sequence
Number      Value
========    =====
1           9
2           9
3           15
4           15
5           15
6           30
7           9

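What I need to do is enumerate the items in sequence-number order and check whether the next element has the same value. If it does, the two are merged into one range, so the desired output is as follows: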

Sequence    Sequence
Number      Number
From        To          Value
========    ========    =====
1           2           9
3           5           15
6           6           30
7           7           9

How can I do this with a LINQ query?

7 answers:

Answer 0 (score: 17)

You can use a modified version of LINQ's GroupBy that groups two items only when they are adjacent; then it is easy:

var result = classes
    .GroupAdjacent(c => c.Value)
    .Select(g => new { 
        SequenceNumFrom = g.Min(c => c.SequenceNumber),
        SequenceNumTo = g.Max(c => c.SequenceNumber),  
        Value = g.Key
    });

foreach (var x in result)
    Console.WriteLine("SequenceNumFrom:{0} SequenceNumTo:{1} Value:{2}", x.SequenceNumFrom, x.SequenceNumTo, x.Value);

Result:

SequenceNumFrom:1  SequenceNumTo:2  Value:9
SequenceNumFrom:3  SequenceNumTo:5  Value:15
SequenceNumFrom:6  SequenceNumTo:6  Value:30
SequenceNumFrom:7  SequenceNumTo:7  Value:9

Here is the extension method that groups adjacent items:

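using System;
using System.Collections.Generic;
using System.Linq;

// Extension methods must live in a static class; the class name is arbitrary.
public static class GroupAdjacentExtensions
{
    // Groups consecutive elements that share a key. Unlike GroupBy, a key that
    // reappears later in the sequence starts a new group.
    public static IEnumerable<IGrouping<TKey, TSource>> GroupAdjacent<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        // The default comparer compares keys safely even when a key is null.
        var comparer = EqualityComparer<TKey>.Default;
        TKey last = default(TKey);
        bool haveLast = false;
        List<TSource> list = new List<TSource>();
        foreach (TSource s in source)
        {
            TKey k = keySelector(s);
            if (haveLast && !comparer.Equals(k, last))
            {
                // The key changed: emit the finished run and start a new one.
                yield return new GroupOfAdjacent<TSource, TKey>(list, last);
                list = new List<TSource>();
            }
            list.Add(s);
            last = k;
            haveLast = true;
        }
        // Emit the final run, if the source was not empty.
        if (haveLast)
            yield return new GroupOfAdjacent<TSource, TKey>(list, last);
    }
}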

And the class it uses:

public class GroupOfAdjacent<TSource, TKey> : IEnumerable<TSource>, IGrouping<TKey, TSource>
{
    public TKey Key { get; set; }
    private List<TSource> GroupList { get; set; }
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return ((System.Collections.Generic.IEnumerable<TSource>)this).GetEnumerator();
    }
    System.Collections.Generic.IEnumerator<TSource> System.Collections.Generic.IEnumerable<TSource>.GetEnumerator()
    {
        foreach (var s in GroupList)
            yield return s;
    }
    public GroupOfAdjacent(List<TSource> source, TKey key)
    {
        GroupList = source;
        Key = key;
    }
}
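
To make the difference from a plain GroupBy concrete, here is a minimal, self-contained sketch. The Runs helper and the AdjacentDemo class are hypothetical names made up for this illustration, not part of the answer's code; the helper returns plain lists instead of IGrouping to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdjacentDemo
{
    // Hypothetical helper for this sketch: splits a sequence into runs of
    // adjacent items whose keys are equal.
    public static IEnumerable<List<T>> Runs<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> run = null;
        TKey runKey = default(TKey);
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (run != null && !comparer.Equals(key, runKey))
            {
                yield return run;   // key changed: close the current run
                run = null;
            }
            if (run == null)
            {
                run = new List<T>();
                runKey = key;
            }
            run.Add(item);
        }
        if (run != null)
            yield return run;       // last run, if the source was not empty
    }

    public static void Main()
    {
        var values = new[] { 9, 9, 15, 15, 15, 30, 9 };

        // Plain GroupBy merges both runs of 9 into a single group.
        Console.WriteLine(values.GroupBy(v => v).Count());   // 3

        // Adjacent grouping keeps the trailing 9 as its own run.
        Console.WriteLine(Runs(values, v => v).Count());     // 4
    }
}
```

With the question's data, GroupBy would report one group for Value 9 covering sequence numbers 1, 2 and 7, which is exactly what the adjacent variant avoids.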

Answer 1 (score: 3)

You can use this LINQ query:

var values = (new[] { 9, 9, 15, 15, 15, 30, 9 }).Select((x, i) => new { x, i });

var query = from v in values
            let firstNonValue = values.Where(v2 => v2.i >= v.i && v2.x != v.x).FirstOrDefault()
            let grouping = firstNonValue == null ? int.MaxValue : firstNonValue.i
            group v by grouping into v
            select new
            {
              From = v.Min(y => y.i) + 1,
              To = v.Max(y => y.i) + 1,
              Value = v.Min(y => y.x)
            };

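Answer 2 (score: 3)

MoreLinq provides this functionality out of the box.

It is called GroupAdjacent and is implemented as an extension method on IEnumerable:

Groups the adjacent elements of a sequence according to a specified key selector function.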

enumerable.GroupAdjacent(e => e.Key)

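There is even a NuGet "source" package that contains only that method, if you don't want to pull in an additional binary NuGet package.

The method returns an IEnumerable<IGrouping<TKey, TValue>>, so its output can be processed in the same way as the output of GroupBy.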

Answer 3 (score: 2)

You can do it like this:

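// Assumes Cls has a (sequenceNumber, value) constructor and that Run is a
// simple class with settable int properties From, To and Value.
var all = new [] {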
    new Cls(1, 9)
,   new Cls(2, 9)
,   new Cls(3, 15)
,   new Cls(4, 15)
,   new Cls(5, 15)
,   new Cls(6, 30)
,   new Cls(7, 9)
};
var f = all.First();
var res = all.Skip(1).Aggregate(
    new List<Run> {new Run {From = f.SequenceNumber, To = f.SequenceNumber, Value = f.Value} }
,   (p, v) => {
    if (v.Value == p.Last().Value) {
        p.Last().To = v.SequenceNumber;
    } else {
        p.Add(new Run {From = v.SequenceNumber, To = v.SequenceNumber, Value = v.Value});
    }
    return p;
});
foreach (var r in res) {
    Console.WriteLine("{0} - {1} : {2}", r.From, r.To, r.Value);
}

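The idea is to use Aggregate creatively: starting with a list that contains a single Run, examine the content of the list we have built so far at each stage of the aggregation (the if statement in the lambda). Depending on the last value, either continue the old run or start a new one.

Here is a demo on ideone.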
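
For readers who want to run the Aggregate idea without the demo, here is a self-contained sketch. The Run class is my assumption of its shape (the answer uses it without showing the definition), and Collapse and AggregateRuns are names invented for this example. It seeds Aggregate with an empty list rather than with the first element, so an empty input is also covered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Cls
{
    public int SequenceNumber { get; set; }
    public int Value { get; set; }
}

// Assumed shape: three settable int properties, as used by the answer's lambda.
public class Run
{
    public int From { get; set; }
    public int To { get; set; }
    public int Value { get; set; }
}

public static class AggregateRuns
{
    // Hypothetical helper: folds the items into a list of runs.
    public static List<Run> Collapse(IEnumerable<Cls> items)
    {
        return items.Aggregate(new List<Run>(), (runs, item) =>
        {
            var last = runs.Count > 0 ? runs[runs.Count - 1] : null;
            if (last != null && last.Value == item.Value)
            {
                last.To = item.SequenceNumber;      // continue the current run
            }
            else
            {
                runs.Add(new Run                    // start a new run
                {
                    From = item.SequenceNumber,
                    To = item.SequenceNumber,
                    Value = item.Value
                });
            }
            return runs;
        });
    }

    public static void Main()
    {
        var values = new[] { 9, 9, 15, 15, 15, 30, 9 };
        var items = values.Select((v, i) => new Cls { SequenceNumber = i + 1, Value = v });

        foreach (var r in Collapse(items))
            Console.WriteLine("{0} - {1} : {2}", r.From, r.To, r.Value);
    }
}
```

Indexing with runs[runs.Count - 1] avoids re-enumerating the list on every step, which calling Last() on a general sequence could do; for a List<T> either form works.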

Answer 4 (score: 2)

I was able to do this by creating a custom extension method.

static class Extensions {
  internal static IEnumerable<Tuple<int, int, int>> GroupAdj(this IEnumerable<Cls> enumerable) {
    Cls start = null;
    Cls end = null;

    foreach (Cls cls in enumerable) {
      if (start == null) {
        start = cls;
        end = cls;
        continue;
      }

      if (start.Value == cls.Value) {
        end = cls;
        continue;
      }

      yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
      start = cls;
      end = cls;
    }

    // Emit the final run; start is still null only when the input was empty.
    if (start != null) {
      yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
    }
  }
}

Here is how it is used:

static void Main() {
  List<Cls> items = new List<Cls> {
    new Cls { SequenceNumber = 1, Value = 9 },
    new Cls { SequenceNumber = 2, Value = 9 },
    new Cls { SequenceNumber = 3, Value = 15 },
    new Cls { SequenceNumber = 4, Value = 15 },
    new Cls { SequenceNumber = 5, Value = 15 },
    new Cls { SequenceNumber = 6, Value = 30 },
    new Cls { SequenceNumber = 7, Value = 9 }
  };

  Console.WriteLine("From  To    Value");
  Console.WriteLine("===== ===== =====");
  foreach (var item in items.OrderBy(i => i.SequenceNumber).GroupAdj()) {
    Console.WriteLine("{0,-5} {1,-5} {2,-5}", item.Item1, item.Item2, item.Item3);
  }
}

Expected output:

From  To    Value
===== ===== =====
1     2     9
3     5     15
6     6     30
7     7     9

Answer 5 (score: 2)

Here is an implementation without any helper methods:

var grp = 0;
var results =
    from i in input.Zip(
        input.Skip(1).Concat(new[] { input.Last() }),
        (n1, n2) => Tuple.Create(
            n1, (n2.Value == n1.Value) ? grp : grp++))
    group i by i.Item2 into gp
    select new
    {
        SequenceNumFrom = gp.Min(x => x.Item1.SequenceNumber),
        SequenceNumTo = gp.Max(x => x.Item1.SequenceNumber),
        Value = gp.Min(x => x.Item1.Value)
    };

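The idea is:

  • Keep track of your own grouping indicator, grp.
  • Pair each item in the collection with the next item in the collection (via Skip(1) and Zip).
  • If the values match, they are in the same group; otherwise, increment grp to signal the start of the next group.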
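
To see the group indicator at work, here is a small sketch that computes only the group index assigned to each element, using plain ints instead of Cls. GroupIndexes and ZipGrouping are hypothetical names for this illustration. Note that grp++ returns the old value, so the element just before a change still gets the current group, and that Last() throws on an empty input, so the sketch assumes at least one element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipGrouping
{
    // Hypothetical helper: returns, for each element, the index of the run
    // it belongs to. Assumes values is not empty (Last() would throw).
    public static List<int> GroupIndexes(IList<int> values)
    {
        var grp = 0;
        return values
            // Pair each element with its successor; the last element is
            // paired with itself so that it stays in its own current group.
            .Zip(values.Skip(1).Concat(new[] { values.Last() }),
                 (current, next) => next == current ? grp : grp++)
            // ToList forces evaluation once, so grp is incremented exactly
            // once per boundary.
            .ToList();
    }

    public static void Main()
    {
        var values = new[] { 9, 9, 15, 15, 15, 30, 9 };
        Console.WriteLine(string.Join(",", GroupIndexes(values)));  // 0,0,1,1,1,2,3
    }
}
```

Because the lambda mutates a captured variable, the query in the answer is best materialized once (for example with ToList) rather than enumerated repeatedly.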

Answer 6 (score: 1)

Untested dark magic follows. The imperative version seems like it would be easier in this case.

IEnumerable<Cls> data = ...;
var query = data
    .GroupBy(x => x.Value)
    .Select(g => new
    {
        Value = g.Key,
        Sequences = g
            .OrderBy(x => x.SequenceNumber)
            .Select((x,i) => new
            {
                x.SequenceNumber,
                OffsetSequenceNumber = x.SequenceNumber - i
            })
            .GroupBy(x => x.OffsetSequenceNumber)
            // Each inner group is one run of consecutive sequence numbers.
            .Select(run => run
                .Select(x => x.SequenceNumber)
                .OrderBy(x => x)
                .ToList())
            .ToList()
    })
    .SelectMany(x => x.Sequences
        .Select(s => new { First = s.First(), Last = s.Last(), x.Value }))
    .OrderBy(x => x.First)
    .ToList();