问题:我有两种对象,我们称之为Building
和Improvement
。大约有30个Improvement
个实例,而可能有1-1000个Building
个。对于Building
和Improvement
的每个组合,我必须执行一些繁重的计算,并将结果存储在Result
对象中。
Building
和Improvement
都可以用整数ID表示。
然后我需要能够:
Result
和Building
的{{1}} Improvement
的所有Result
对Improvement
执行汇总,例如.Sum()和.Average()Building
的所有Result
对Building
执行相同的汇总这将在网络服务器后端发生,因此内存可能是一个问题,但速度是最重要的。
到目前为止的想法:
Improvement
作为关键字Dictionary<Tuple<int, int>, Result>
。这应该给我快速插入和单个查找,但我担心<BuildingID, ImprovementID>
和.Where()
性能。.Sum()
个,一个维度为BuildingID
个,ImprovementID
为值。此外,构建两个Result
,将Dictionary<int, int>
和BuildingID
映射到各自的数组行/列索引。这可能意味着最多1000 + ImprovementID
s,这会是一个问题吗?Dictionary
。我认为这可能效率最低,有O(n)插入,但我可能错了。我在这里错过了一个明显更好的选择吗?
编辑:结果只是我感兴趣的聚合值(每List<Tuple<int, int, Result>>
和每Building
);看到我的回答。
答案 0 :(得分:3)
通常,词典是查找效率最高的。当通过密钥访问时,查找效率和操作效率都是常数O(1)。这将有助于访问,第一点。
在第二个和第三个中你需要遍历所有项目O(n),所以没有办法加速它,除非你想按照指定的顺序O(n * n)走过它们 - 那么你可以使用SortedDictionray O(n),但是你会损害查找和操作效率(O(log n))。
所以我会选择你发布的第一个解决方案。
答案 1 :(得分:2)
您可以使用“词典词典”来保存结果数据,例如:
// Building ID ↓ ↓ Improvement ID
var data = new Dictionary<int, Dictionary<int, Result>>();
这可以让您快速找到特定建筑的改进。
然而,找到包含特定改进的建筑物需要迭代所有建筑物。这是一些示例代码:
using System;
using System.Linq;
using System.Collections.Generic;
namespace Demo
{
sealed class Result
{
public double Data;
}
sealed class Building
{
public int Id;
public int Value;
}
sealed class Improvement
{
public int Id;
public int Value;
}
class Program
{
void run()
{
// Building ID ↓ ↓ Improvement ID
var data = new Dictionary<int, Dictionary<int, Result>>();
for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
{
var improvements = new Dictionary<int, Result>();
for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
improvements.Add(improvementKey, new Result{ Data = buildingKey + improvementKey/1000.0 });
data.Add(buildingKey, improvements);
}
// Aggregate data for all improvements for building with ID == 1500:
int buildingId = 1500;
var sum = data[buildingId].Sum(result => result.Value.Data);
Console.WriteLine(sum);
// Aggregate data for all buildings with a given improvement.
int improvementId = 5010;
sum = data.Sum(improvements =>
{
Result result;
return improvements.Value.TryGetValue(improvementId, out result) ? result.Data : 0.0;
});
Console.WriteLine(sum);
}
static void Main()
{
new Program().run();
}
}
}
为了加快第二次聚合(对于使用给定ID的所有改进的数据求和),我们可以使用第二个字典:
// Improvment ID ↓ ↓ Building ID
var byImprovementId = new Dictionary<int, Dictionary<int, Result>>();
你需要一个额外的字典来维护,但它并不太复杂。像这样的几个嵌套字典可能会占用太多内存 - 但值得考虑。
如下面的评论中所述,最好定义ID的类型以及字典本身。把它放在一起给出了:
using System;
using System.Linq;
using System.Collections.Generic;
namespace Demo
{
sealed class Result
{
public double Data;
}
sealed class BuildingId
{
public BuildingId(int id)
{
Id = id;
}
public readonly int Id;
public override int GetHashCode()
{
return Id.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as BuildingId;
if (other == null)
return false;
return this.Id == other.Id;
}
}
sealed class ImprovementId
{
public ImprovementId(int id)
{
Id = id;
}
public readonly int Id;
public override int GetHashCode()
{
return Id.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as ImprovementId;
if (other == null)
return false;
return this.Id == other.Id;
}
}
sealed class Building
{
public BuildingId Id;
public int Value;
}
sealed class Improvement
{
public ImprovementId Id;
public int Value;
}
sealed class BuildingResults : Dictionary<BuildingId, Result>{}
sealed class ImprovementResults: Dictionary<ImprovementId, Result>{}
sealed class BuildingsById: Dictionary<BuildingId, ImprovementResults>{}
sealed class ImprovementsById: Dictionary<ImprovementId, BuildingResults>{}
class Program
{
void run()
{
var byBuildingId = CreateTestBuildingsById(); // Create some test data.
var byImprovementId = CreateImprovementsById(byBuildingId); // Create the alternative lookup dictionaries.
// Aggregate data for all improvements for building with ID == 1500:
BuildingId buildingId = new BuildingId(1500);
var sum = byBuildingId[buildingId].Sum(result => result.Value.Data);
Console.WriteLine(sum);
// Aggregate data for all buildings with a given improvement.
ImprovementId improvementId = new ImprovementId(5010);
sum = byBuildingId.Sum(improvements =>
{
Result result;
return improvements.Value.TryGetValue(improvementId, out result) ? result.Data : 0.0;
});
Console.WriteLine(sum);
// Aggregate data for all buildings with a given improvement using byImprovementId.
// This will be much faster than the above Linq.
sum = byImprovementId[improvementId].Sum(result => result.Value.Data);
Console.WriteLine(sum);
}
static BuildingsById CreateTestBuildingsById()
{
var byBuildingId = new BuildingsById();
for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
{
var improvements = new ImprovementResults();
for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
{
improvements.Add
(
new ImprovementId(improvementKey),
new Result
{
Data = buildingKey + improvementKey/1000.0
}
);
}
byBuildingId.Add(new BuildingId(buildingKey), improvements);
}
return byBuildingId;
}
static ImprovementsById CreateImprovementsById(BuildingsById byBuildingId)
{
var byImprovementId = new ImprovementsById();
foreach (var improvements in byBuildingId)
{
foreach (var improvement in improvements.Value)
{
if (!byImprovementId.ContainsKey(improvement.Key))
byImprovementId[improvement.Key] = new BuildingResults();
byImprovementId[improvement.Key].Add(improvements.Key, improvement.Value);
}
}
return byImprovementId;
}
static void Main()
{
new Program().run();
}
}
}
最后,这是一个修改版本,它确定为特定改进聚合建筑/改进组合的所有实例的数据所花费的时间,并将元组字典的结果与字典词典进行比较。
我的RELEASE构建结果在任何调试器之外运行:
Dictionary of dictionaries took 00:00:00.2967741
Dictionary of tuples took 00:00:07.8164672
使用字典词典要快得多,但如果您打算进行许多这些聚合,这只是非常重要。
using System;
using System.Diagnostics;
using System.Linq;
using System.Collections.Generic;
namespace Demo
{
sealed class Result
{
public double Data;
}
sealed class BuildingId
{
public BuildingId(int id)
{
Id = id;
}
public readonly int Id;
public override int GetHashCode()
{
return Id.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as BuildingId;
if (other == null)
return false;
return this.Id == other.Id;
}
}
sealed class ImprovementId
{
public ImprovementId(int id)
{
Id = id;
}
public readonly int Id;
public override int GetHashCode()
{
return Id.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as ImprovementId;
if (other == null)
return false;
return this.Id == other.Id;
}
}
sealed class Building
{
public BuildingId Id;
public int Value;
}
sealed class Improvement
{
public ImprovementId Id;
public int Value;
}
sealed class BuildingResults : Dictionary<BuildingId, Result>{}
sealed class ImprovementResults: Dictionary<ImprovementId, Result>{}
sealed class BuildingsById: Dictionary<BuildingId, ImprovementResults>{}
sealed class ImprovementsById: Dictionary<ImprovementId, BuildingResults>{}
class Program
{
void run()
{
var byBuildingId = CreateTestBuildingsById(); // Create some test data.
var byImprovementId = CreateImprovementsById(byBuildingId); // Create the alternative lookup dictionaries.
var testTuples = CreateTestTuples();
ImprovementId improvementId = new ImprovementId(5010);
int count = 10000;
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < count; ++i)
byImprovementId[improvementId].Sum(result => result.Value.Data);
Console.WriteLine("Dictionary of dictionaries took " + sw.Elapsed);
sw.Restart();
for (int i = 0; i < count; ++i)
testTuples.Where(result => result.Key.Item2.Equals(improvementId)).Sum(item => item.Value.Data);
Console.WriteLine("Dictionary of tuples took " + sw.Elapsed);
}
static Dictionary<Tuple<BuildingId, ImprovementId>, Result> CreateTestTuples()
{
var result = new Dictionary<Tuple<BuildingId, ImprovementId>, Result>();
for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
result.Add(
new Tuple<BuildingId, ImprovementId>(new BuildingId(buildingKey), new ImprovementId(improvementKey)),
new Result
{
Data = buildingKey + improvementKey/1000.0
});
return result;
}
static BuildingsById CreateTestBuildingsById()
{
var byBuildingId = new BuildingsById();
for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
{
var improvements = new ImprovementResults();
for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
{
improvements.Add
(
new ImprovementId(improvementKey),
new Result
{
Data = buildingKey + improvementKey/1000.0
}
);
}
byBuildingId.Add(new BuildingId(buildingKey), improvements);
}
return byBuildingId;
}
static ImprovementsById CreateImprovementsById(BuildingsById byBuildingId)
{
var byImprovementId = new ImprovementsById();
foreach (var improvements in byBuildingId)
{
foreach (var improvement in improvements.Value)
{
if (!byImprovementId.ContainsKey(improvement.Key))
byImprovementId[improvement.Key] = new BuildingResults();
byImprovementId[improvement.Key].Add(improvements.Key, improvement.Value);
}
}
return byImprovementId;
}
static void Main()
{
new Program().run();
}
}
}
答案 2 :(得分:0)
感谢您的回答,测试代码确实非常有用:)
我的解决方案原来是放弃LINQ,并在繁重的计算后直接手动执行聚合,因为我不得不迭代构建和改进的每个组合。
另外,我必须使用对象本身作为键,以便在将对象持久化到Entity Framework之前执行计算(即它们的ID都是0)。
代码:
public class Building {
public int ID { get; set; }
...
}
public class Improvement {
public int ID { get; set; }
...
}
public class Result {
public decimal Foo { get; set; }
public long Bar { get; set; }
...
public void Add(Result result) {
Foo += result.Foo;
Bar += result.Bar;
...
}
}
public class Calculator {
public Dictionary<Building, Result> ResultsByBuilding;
public Dictionary<Improvement, Result> ResultsByImprovement;
public void CalculateAndAggregate(IEnumerable<Building> buildings, IEnumerable<Improvement> improvements) {
ResultsByBuilding = new Dictionary<Building, Result>();
ResultsByImprovement = new Dictionary<Improvement, Result>();
for (building in buildings) {
for (improvement in improvements) {
Result result = DoHeavyCalculation(building, improvement);
if (ResultsByBuilding.ContainsKey(building)) {
ResultsByBuilding[building].Add(result);
} else {
ResultsByBuilding[building] = result;
}
if (ResultsByImprovement.ContainsKey(improvement)) {
ResultsByImprovement[improvement].Add(result);
} else {
ResultsByImprovement[improvement] = result;
}
}
}
}
}
public static void Main() {
var calculator = new Calculator();
IList<Building> buildings = GetBuildingsFromRepository();
IList<Improvement> improvements = GetImprovementsFromRepository();
calculator.CalculateAndAggregate(buildings, improvements);
DoStuffWithResults(calculator);
}
我是这样做的,因为我确切地知道我想要的聚合;如果我需要一种更有活力的方法,我可能会选择@ MatthewWatson的词典词典。