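I have about 300K rows in a DataTable. The first column is "utcDT", which holds a DateTime with minute resolution.
I want to group the data by date into a list of "ReportDailyData". My approach is below, but it takes about 8 seconds to run. I need to speed this up dramatically.
Is there a better way?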
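Answer 0 (score: 0)
I suggest sorting by utcDT, then enumerating the result linearly and doing the grouping and aggregation by hand into a new data structure. For each new utcDT value you encounter, create a new ReportDailyData instance and keep aggregating values into it for as long as utcDT stays the same.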
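The sort-then-scan idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real fields of ReportDailyData are not shown in the question, so the class here is a hypothetical stand-in, the "value" column is made up, and the bucket key is taken to be the calendar date of utcDT (since the question asks for grouping by date).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the asker's ReportDailyData; the real fields are not shown.
public class ReportDailyData
{
    public DateTime Date { get; set; }
    public int RowCount { get; set; }
    public double ValueSum { get; set; }
}

public static class DailyGrouper
{
    // Assumes a DateTime column "utcDT" and a hypothetical double column "value".
    public static List<ReportDailyData> GroupByDay(DataTable table)
    {
        var result = new List<ReportDailyData>();
        ReportDailyData current = null;

        // Sort once by timestamp, then walk the rows a single time.
        var sortedRows = table.AsEnumerable().OrderBy(r => r.Field<DateTime>("utcDT"));
        foreach (DataRow row in sortedRows)
        {
            DateTime day = row.Field<DateTime>("utcDT").Date;

            // A change of calendar date starts a new bucket.
            if (current == null || current.Date != day)
            {
                current = new ReportDailyData { Date = day };
                result.Add(current);
            }

            // Aggregate into the bucket for the current date.
            current.RowCount++;
            current.ValueSum += row.Field<double>("value");
        }

        return result;
    }
}
```

Because the rows are sorted first, each date is visited as one contiguous run, so no dictionary lookup is needed per row. If the table is already ordered by utcDT, the OrderBy call can presumably be dropped.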
Answer 1 (score: 0)
If I understand correctly, you want to group a collection of data, is that right?
If so, why not use LINQ's GroupBy method?
A simple example follows:
// Script-style example; needs the System, System.Collections.Generic and System.Linq namespaces.
void Main()
{
    // Sample rows: two of them fall on the same calendar date.
    var data = new List<MyData>();
    data.Add(new MyData() { UtcDT = DateTime.UtcNow, Volume = 1 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-1), Volume = 1 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-1), Volume = 4 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-2), Volume = 5 });
    var result = GroupReportDataAndFormat(data);
}

// Groups the rows by calendar date and sums Volume for each day.
public Dictionary<DateTime, int> GroupReportDataAndFormat(List<MyData> data)
{
    return data
        .GroupBy(t => t.UtcDT.Date)
        .ToDictionary(g => g.Key, g => g.Sum(s => s.Volume));
}

public class MyData
{
    public DateTime UtcDT { get; set; }
    public int Volume { get; set; }
}
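Of course, for performance reasons you should probably do the grouping at the database level (write a query that returns data that is already grouped).
=== Update ===
MainInMoon: I have updated the solution to fit your case: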
// Script-style example; needs the System, System.Collections.Generic and System.Linq namespaces.
void Main()
{
    var data = new List<MyData>();
    data.Add(new MyData() { UtcDT = DateTime.UtcNow, DayPnl = 1, Positions = 3 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-1), DayPnl = 1, Positions = 4 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-1), DayPnl = 4, Positions = 5 });
    data.Add(new MyData() { UtcDT = DateTime.UtcNow.AddDays(-2), DayPnl = 5, Positions = 6 });
    var result = GroupReportDataAndFormat(data);
}

// Groups the rows by calendar date. Within a group the rows keep their source order,
// so the deltas are only meaningful if the input is ordered by time.
public Dictionary<DateTime, GroupResult> GroupReportDataAndFormat(List<MyData> data)
{
    return data.GroupBy(t => t.UtcDT.Date).ToDictionary(
        g => g.Key,
        g => new GroupResult
        {
            DayPnlSum = g.Sum(s => s.DayPnl),
            // Pair each position with the next one in the same day and take the difference.
            Deltas = g.Select(t => t.Positions)
                      .Zip(g.Select(s => s.Positions).Skip(1), (current, next) => next - current)
        });
}

public class GroupResult
{
    public double DayPnlSum { get; set; }

    // Lazily evaluated: re-enumerated every time TradeCount or Volume is read.
    public IEnumerable<double> Deltas { get; set; }

    // Number of position changes within the day.
    public int TradeCount
    {
        get { return Deltas.Count(x => x != 0); }
    }

    // Total absolute position change within the day.
    public int Volume
    {
        get { return (int)Deltas.Sum(x => Math.Abs(x)); }
    }
}

public class MyData
{
    public DateTime UtcDT { get; set; }
    public int DayPnl { get; set; }
    public double Positions { get; set; }
}
Of course, you can change the TradeCount and Volume properties so that they are calculated during the grouping (rather than evaluated lazily).
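One way that eager variant might look is sketched below. It is an assumption-laden illustration, not part of the original answer: EagerGroupResult and EagerGrouping are hypothetical names, MyData is repeated only so the snippet is self-contained, and the input is assumed to be ordered by time within each day.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as MyData in the answer, repeated so the snippet stands alone.
public class MyData
{
    public DateTime UtcDT { get; set; }
    public int DayPnl { get; set; }
    public double Positions { get; set; }
}

// Hypothetical result type: plain stored values instead of lazily computed properties.
public class EagerGroupResult
{
    public double DayPnlSum { get; set; }
    public int TradeCount { get; set; }
    public int Volume { get; set; }
}

public static class EagerGrouping
{
    public static Dictionary<DateTime, EagerGroupResult> GroupReportDataAndFormat(List<MyData> data)
    {
        return data.GroupBy(t => t.UtcDT.Date).ToDictionary(
            g => g.Key,
            g =>
            {
                var result = new EagerGroupResult();
                double volume = 0;
                double? previous = null;

                // Single pass over the day's rows, in source order.
                foreach (var row in g)
                {
                    result.DayPnlSum += row.DayPnl;

                    if (previous.HasValue)
                    {
                        double delta = row.Positions - previous.Value;
                        if (delta != 0)
                        {
                            result.TradeCount++;          // a position change counts as a trade
                            volume += Math.Abs(delta);    // accumulate absolute change
                        }
                    }
                    previous = row.Positions;
                }

                result.Volume = (int)volume;
                return result;
            });
    }
}
```

Each group is walked exactly once here, whereas the lazy version re-runs the Zip pipeline every time TradeCount or Volume is read.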