LINQ按时间段聚合和分组

时间:2012-01-13 19:40:37

标签: c# linq aggregate-functions

我试图了解如何使用LINQ按时间间隔对数据进行分组;然后理想地聚合每个小组。

查找具有明确日期范围的众多示例,我正在尝试按时段分组,例如5分钟,1小时,1天。

例如,我有一个用Date:

包装DateTime的类
public class Sample
{
     public DateTime timestamp;
     public double value;
}

这些观察结果包含在List集合中的一系列文章中:

List<Sample> series;

所以,按小时计算时间和按平均值计算总价值,我正在尝试做类似的事情:

var grouped = from s in series
              group s by new TimeSpan(1, 0, 0) into g
              select new { timestamp = g.Key, value = g.Average(s => s.value };

这基本上是有缺陷的,因为它将TimeSpan本身分组。我无法理解如何在查询中使用TimeSpan(或任何表示间隔的数据类型)。

9 个答案:

答案 0 :(得分:44)

您可以将时间戳四舍五入到下一个边界(即过去最接近最近的5分钟边界)并将其用作分组:

var groups = series.GroupBy(x =>
{
    var stamp = x.timestamp;
    stamp = stamp.AddMinutes(-(stamp.Minute % 5));
    stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
    return stamp;
})
.Select(g => new { TimeStamp = g.Key, Value = g.Average(s => s.value) })
.ToList();

以上通过在分组中使用修改的时间戳来实现,该分组将分钟设置为前5分钟边界并删除秒和毫秒。当然,同样的方法可以用于其他时间段,即小时和天。

修改

基于这个组成的样本输入:

var series = new List<Sample>();
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(3) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(4) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(5) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(6) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(7) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(15) });

为我制作了3组,其中一组分组时间戳为3:05,一组为3:10,另一组为下午3:20(您的结果可能因当前时间而异)。

答案 1 :(得分:11)

您需要一个对时间戳进行舍入的函数。类似的东西:

 var grouped = from s in series
          group s by new DateTime(s.timestamp.Year, s.timestamp.Month,  
                s.timestamp.Day, s.timestamp.Hour, 0, 0) into g
          select new { timestamp = g.Key, value = g.Average(s => s.value };

每小时箱子。请注意,结果中的时间戳现在将是DateTime,而不是TimeSpan。

答案 2 :(得分:6)

我在这场比赛上的比赛已经很晚了,但是我在寻找其他东西的过程中遇到了这个问题,我认为我有更好的方法。

series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();

这是一个示例linqpad程序,您可以验证和测试

void Main()
{
    List<Sample> series = new List<Sample>();

    Random random = new Random(DateTime.Now.Millisecond);
    for (DateTime i = DateTime.Now.AddDays(-5); i < DateTime.Now; i += TimeSpan.FromMinutes(1))
    {
        series.Add(new UserQuery.Sample(){ timestamp = i, value = random.NextDouble() * 100 });
    }
    //series.Dump();
    series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();
}

// Define other methods and classes here
public class Sample
{
     public DateTime timestamp;
     public double value;
}

答案 3 :(得分:2)

对于按小时分组,您需要按时间戳的小时部分分组,这可以这样做:

var groups = from s in series
  let groupKey = new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0)
  group s by groupKey into g select new
                                      {
                                        TimeStamp = g.Key,
                                        Value = g.Average(a=>a.value)
                                      };

答案 4 :(得分:1)

我建议使用新的DateTime()避免任何亚毫秒差异的问题

var versionsGroupedByRoundedTimeAndAuthor = db.Versions.GroupBy(g => 
new
{
                UserID = g.Author.ID,
                Time = RoundUp(g.Timestamp, TimeSpan.FromMinutes(2))
});

  private DateTime RoundUp(DateTime dt, TimeSpan d)
        {
            return new DateTime(((dt.Ticks + d.Ticks - 1) / d.Ticks) * d.Ticks);
        }

N.B。我在这里按Author.ID进行分组以及舍入的TimeStamp。

RoundUp函数取自@dtb answer here https://stackoverflow.com/a/7029464/661584

了解平等到毫秒并不总是意味着在这里平等Why does this unit test fail when testing DateTime equality?

答案 5 :(得分:0)

即使我真的很晚,这是我的2美分:

我希望以5分钟的间隔向下和向上舍入()时间值:

10:31 --> 10:30
10:33 --> 10:35
10:36 --> 10:35

这可以通过转换为TimeSpan.Tick并转换回DateTime并使用Math.Round()来实现:

public DateTime GetShiftedTimeStamp(DateTime timeStamp, int minutes)
{
    return
        new DateTime(
            Convert.ToInt64(
                Math.Round(timeStamp.Ticks / (decimal)TimeSpan.FromMinutes(minutes).Ticks, 0, MidpointRounding.AwayFromZero)
                    * TimeSpan.FromMinutes(minutes).Ticks));
}

如上所示,shiftedTimeStamp可用于linq分组。

答案 6 :(得分:0)

我对BrokenGlass的回答进行了改进,使其更具通用性并添加了保护措施。根据他目前的答案,如果你选择9的间隔,它将不会做你期望的。任何数字60都不能被整除。对于这个例子,我使用9并从午夜(0:00)开始。

  • 从0:00到0:08.999的所有内容都将按照您的预期放入0:00组中。它会一直这样做,直到你进入从0:54开始的分组。
  • 在0:54,它只会将事物从0:54分组到0:59.999而不是上升到01:03.999。

对我来说,这是一个大问题。

我不知道如何解决这个问题,但你可以添加保护措施 变化:

  1. 60%[interval]等于0的任何一分钟都是可接受的间隔。下面的if语句可以保护这一点。
  2. 小时间隔也有效。

            double minIntervalAsDouble = Convert.ToDouble(minInterval);
            if (minIntervalAsDouble <= 0)
            {
                string message = "minInterval must be a positive number, exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            else if (minIntervalAsDouble < 60.0 && 60.0 % minIntervalAsDouble != 0)
            {
                string message = "60 must be divisible by minInterval...exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            else if (minIntervalAsDouble >= 60.0 && (24.0 % (minIntervalAsDouble / 60.0)) != 0 && (24.0 % (minIntervalAsDouble / 60.0) != 24.0))
            {
                //hour part must be divisible...
                string message = "If minInterval is greater than 60, 24 must be divisible by minInterval/60 (hour value)...exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            var groups = datas.GroupBy(x =>
            {
                if (minInterval < 60)
                {
                    var stamp = x.Created;
                    stamp = stamp.AddMinutes(-(stamp.Minute % minInterval));
                    stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                    stamp = stamp.AddSeconds(-stamp.Second);
                    return stamp;
                }
                else
                {
                    var stamp = x.Created;
                    int hourValue = minInterval / 60;
                    stamp = stamp.AddHours(-(stamp.Hour % hourValue));
                    stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                    stamp = stamp.AddSeconds(-stamp.Second);
                    stamp = stamp.AddMinutes(-stamp.Minute);
                    return stamp;
                }
            }).Select(o => new
            {
                o.Key,
                min = o.Min(f=>f.Created),
                max = o.Max(f=>f.Created),
                o
            }).ToList();
    
  3. 在select语句中添加您想要的任何内容!我输入了min / max,因为它更容易测试。

答案 7 :(得分:0)

我知道这并没有直接回答这个问题,但我在谷歌上搜索一个非常类似的解决方案来汇总股票/加密货币的蜡烛数据,从较小的分钟到较高的分钟期间(5,10) ,15,30)。由于聚合时段的时间戳不一致,因此您无法一次从当前分钟返回X。您还必须注意列表的开头和结尾有足够的数据来填充较大时段的完整烛台。鉴于此,我提出的解决方案如下。 (它假设较小时期的蜡烛,如rawPeriod所示,按升序时间戳排序。)

public class Candle
{
    public long Id { get; set; }
    public Period Period { get; set; }
    public DateTime Timestamp { get; set; }
    public double High { get; set; }
    public double Low { get; set; }
    public double Open { get; set; }
    public double Close { get; set; }
    public double BuyVolume { get; set; }
    public double SellVolume { get; set; }
}

public enum Period
{
    Minute = 1,
    FiveMinutes = 5,
    QuarterOfAnHour = 15,
    HalfAnHour = 30
}

    private List<Candle> AggregateCandlesIntoRequestedTimePeriod(Period rawPeriod, Period requestedPeriod, List<Candle> candles)
    {
        if (rawPeriod != requestedPeriod)
        {
            int rawPeriodDivisor = (int) requestedPeriod;
            candles = candles
                        .GroupBy(g => new { TimeBoundary = new DateTime(g.Timestamp.Year, g.Timestamp.Month, g.Timestamp.Day, g.Timestamp.Hour, (g.Timestamp.Minute / rawPeriodDivisor) * rawPeriodDivisor , 0) })
                        .Where(g => g.Count() == rawPeriodDivisor )
                        .Select(s => new Candle
                        {
                            Period = requestedPeriod,
                            Timestamp = s.Key.TimeBoundary,
                            High = s.Max(z => z.High),
                            Low = s.Min(z => z.Low),
                            Open = s.First().Open,
                            Close = s.Last().Close,
                            BuyVolume = s.Sum(z => z.BuyVolume),
                            SellVolume = s.Sum(z => z.SellVolume),
                        })
                        .OrderBy(o => o.Timestamp)
                        .ToList();
        }

        return candles;
    }

答案 8 :(得分:0)

广义解决方案:

    static IEnumerable<IGrouping<DateRange, T>> GroupBy<T>(this IOrderedEnumerable<T> enumerable, TimeSpan timeSpan, Func<T, DateTime> predicate)
    {
        Grouping<T> grouping = null;
        foreach (var (a, dt) in from b in enumerable select (b, predicate.Invoke(b)))
        {
            if (grouping == null || dt > grouping.Key.End)
                yield return grouping = new Grouping<T>(new DateRange(dt, dt + timeSpan), a);
            else
                grouping.Add(a);
        }
    }

    class Grouping<T> : IGrouping<DateRange, T>
    {

        readonly List<T> elements = new List<T>();

        public DateRange Key { get; }

        public Grouping(DateRange key) => Key = key;

        public Grouping(DateRange key, T element) : this(key) => Add(element);

        public void Add(T element) => elements.Add(element);

        public IEnumerator<T> GetEnumerator()=> this.elements.GetEnumerator();

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    class DateRange
    {
    
        public DateRange(DateTime start, DateTime end)
        {
            this.Start = start;
            this.End = end;
        }

        public DateTime Start { get; set; }
        public DateTime End { get; set; }
    }

根据问题进行测试(使用AutoFixture库)

     void Test()
    {
        var many = new Fixture().CreateMany<Sample>(100);

        var groups = many.OrderBy(a => a.timestamp).GroupBy(TimeSpan.FromDays(365), a => a.timestamp).Select(a => a.Average(b => b.value)).ToArray();

    }

    public class Sample
    {
        public DateTime timestamp;
        public double value;
    }