Why does this LINQ query (Id is a property of type long on the Structure object):
    IList<Structure> theStructures = new List<Structure>();
    public int GetChildrenSlow(Structure aStructure)
    {
        IEnumerable<Structure> childrenQuery =
            from structure in theStructures
            where structure.ParentStructureId == aStructure.Id
            select structure;
        int count = childrenQuery.Count();
        //Functionality continues...
    }
run slower than this one:
    IList<Structure> theStructures = new List<Structure>();
    public int GetChildrenFast(long aStructureId)
    {
        IEnumerable<Structure> childrenQuery =
            from structure in theStructures
            where structure.ParentStructureId == aStructureId
            select structure;
        int count = childrenQuery.Count();
        //Functionality continues...
    }
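I make this call thousands of times (recursively), and using the property is much slower than using the long directly. If I pull the Id out and store it in a long variable before executing the LINQ query, the speed is almost equal to that of GetChildrenFast. Why is using an object property in LINQ slower than using a primitive?
Working example: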
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    namespace ConsoleApplication1
    {
        class Structure
        {
            public int Id { get; set; }
            public int ParentStructureId { get; set; }
        }

        class Program
        {
            private IList<Structure> theStructures = new List<Structure>();

            public Structure FirstStructure { get; set; }

            private int FastCountStructureChildren(long aStructureId)
            {
                IEnumerable<Structure> childrenQuery =
                    from structure in theStructures
                    where structure.ParentStructureId == aStructureId
                    select structure;
                int count = childrenQuery.Count();
                foreach (Structure childStructure in childrenQuery)
                {
                    count += FastCountStructureChildren(childStructure.Id);
                }
                return count;
            }

            private int SlowCountStructureChildren(Structure aStructure)
            {
                IEnumerable<Structure> childrenQuery =
                    from structure in theStructures
                    where structure.ParentStructureId == aStructure.Id
                    select structure;
                int count = childrenQuery.Count();
                foreach (Structure childStructure in childrenQuery)
                {
                    count += SlowCountStructureChildren(childStructure);
                }
                return count;
            }

            public void BuildStructure()
            {
                FirstStructure = new Structure { Id = 0, ParentStructureId = -1 };
                theStructures.Add(FirstStructure);
                //The loop only goes to 6000 as any more than that causes
                //a StackOverflowException on my development machine.
                for (int i = 1; i < 6000; i++)
                {
                    Structure newStructure = new Structure { Id = i, ParentStructureId = i - 1 };
                    theStructures.Add(newStructure);
                }
            }

            static void Main(string[] args)
            {
                Program program = new Program();
                program.BuildStructure();

                Stopwatch fastStopwatch = new Stopwatch();
                fastStopwatch.Start();
                program.FastCountStructureChildren(0);
                fastStopwatch.Stop();

                Stopwatch slowStopwatch = new Stopwatch();
                slowStopwatch.Start();
                program.SlowCountStructureChildren(program.FirstStructure);
                slowStopwatch.Stop();

                Console.WriteLine("Fast time: " + fastStopwatch.Elapsed);
                Console.WriteLine("Slow time: " + slowStopwatch.Elapsed);
                Console.ReadLine();
            }
        }
    }
Answer 0: (score: 2)
Running the complete example exactly as provided, I get:
Fast time: 00:00:01.6187793
Slow time: 00:00:01.3977344
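I only see the "slow" version actually run slower when I run in Debug mode. That is because in Debug mode methods are never inlined, and NOPs are scattered everywhere to let you set breakpoints, for example inside the Id getter.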
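Since you obviously care about execution speed, I'll point out an unrelated inefficiency: you run the query twice, once for the count and once to iterate over the children. Running it only once (and incrementing the count by 1 inside the loop) should speed things up.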
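As a minimal sketch of that single-pass idea (the class name SinglePassDemo and the method CountDescendants are made up for illustration, and the Structure class mirrors the one in the question), the recursive count can add 1 per child inside the loop instead of calling Count() first:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Structure
{
    public int Id { get; set; }
    public int ParentStructureId { get; set; }
}

static class SinglePassDemo
{
    // Counts every descendant of parentId. The Where query is enumerated
    // exactly once per call; each child contributes 1 plus its own subtree.
    public static int CountDescendants(IList<Structure> all, int parentId)
    {
        int count = 0;
        foreach (Structure child in all.Where(s => s.ParentStructureId == parentId))
        {
            count += 1 + CountDescendants(all, child.Id);
        }
        return count;
    }

    static void Main()
    {
        var all = new List<Structure>
        {
            new Structure { Id = 0, ParentStructureId = -1 },
            new Structure { Id = 1, ParentStructureId = 0 },
            new Structure { Id = 2, ParentStructureId = 0 },
            new Structure { Id = 3, ParentStructureId = 1 },
        };
        Console.WriteLine(CountDescendants(all, 0)); // prints 3
    }
}
```

Each call still scans the whole list, so this halves the work per level rather than changing the overall complexity.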
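By the way, the way I usually handle this: if it makes sense to call the GetChildren method directly with an id, provide two overloads. Otherwise, provide a single (Structure) overload and read the id before the query, as in long id = aStructure.Id;.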
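A sketch of the two-overload approach, assuming a hypothetical StructureStore wrapper and a CountChildren method (neither is in the question). The Structure overload reads Id once into a local and delegates to the long overload, so the predicate only ever compares against a plain value:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Structure
{
    public long Id { get; set; }
    public long ParentStructureId { get; set; }
}

class StructureStore
{
    private readonly IList<Structure> theStructures = new List<Structure>();

    public void Add(Structure s)
    {
        theStructures.Add(s);
    }

    // Overload for callers that already have the id.
    public int CountChildren(long aStructureId)
    {
        return theStructures.Count(s => s.ParentStructureId == aStructureId);
    }

    // Overload for callers that have the object: read Id once, then delegate.
    public int CountChildren(Structure aStructure)
    {
        long id = aStructure.Id;
        return CountChildren(id);
    }
}

static class OverloadDemo
{
    static void Main()
    {
        var store = new StructureStore();
        var root = new Structure { Id = 0, ParentStructureId = -1 };
        store.Add(root);
        store.Add(new Structure { Id = 1, ParentStructureId = 0 });
        store.Add(new Structure { Id = 2, ParentStructureId = 0 });

        Console.WriteLine(store.CountChildren(root)); // prints 2
        Console.WriteLine(store.CountChildren(1L));   // prints 0
    }
}
```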
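Answer 1: (score: 1)
Well, even if the property access is inlined, I suspect a null check is still needed on every iteration. That is an extra conditional, which could mess up branch prediction, for example.
It would be interesting to play with a complete example, but I suspect it comes down to the fact that you are doing extra work on every delegate invocation. It is also possible that this "little bit extra" has turned off some other inlining related to the delegate, leading to a kind of domino effect on performance.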
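To make the "extra work on every delegate invocation" concrete, here is a hand-written approximation of the closures behind the two predicates. The compiler hoists captured variables into fields of a generated class; the class names SlowClosure and FastClosure below are invented for illustration, and the real generated names and layout differ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Structure
{
    public long Id { get; set; }
    public long ParentStructureId { get; set; }
}

// Stand-in for the closure created when the lambda captures aStructure.
sealed class SlowClosure
{
    public Structure aStructure;

    // Each call loads the field, dereferences the object, and calls the Id getter.
    public bool Match(Structure s)
    {
        return s.ParentStructureId == aStructure.Id;
    }
}

// Stand-in for the closure created when the lambda captures aStructureId.
sealed class FastClosure
{
    public long aStructureId;

    // Each call only loads a long field.
    public bool Match(Structure s)
    {
        return s.ParentStructureId == aStructureId;
    }
}

static class ClosureDemo
{
    static void Main()
    {
        var items = new List<Structure>
        {
            new Structure { Id = 0, ParentStructureId = -1 },
            new Structure { Id = 1, ParentStructureId = 0 },
            new Structure { Id = 2, ParentStructureId = 0 },
        };

        Func<Structure, bool> slowPredicate = new SlowClosure { aStructure = items[0] }.Match;
        Func<Structure, bool> fastPredicate = new FastClosure { aStructureId = items[0].Id }.Match;

        Console.WriteLine(items.Count(slowPredicate)); // prints 2
        Console.WriteLine(items.Count(fastPredicate)); // prints 2
    }
}
```

Both predicates give the same result; whether the extra dereference is measurable depends on the build configuration and the JIT.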
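Answer 2: (score: 0)
long is a struct (a value type); its construction and memory footprint differ from those of an object, which is obviously slower and, I believe, larger.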
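Answer 3: (score: 0)
In the "Functionality continues" part, do you use childrenQuery again? Do you realize that it re-enumerates theStructures every time? Avoid enumerating a large data set more than once, and the per-item cost of the property access will not matter much.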
    IList<Structure> theStructures = new List<Structure>();
    ILookup<int, Structure> byParentId = null;
    public int GetChildren(Structure aStructure)
    {
        // Build the lookup once, on first use (note == for comparison, not =).
        if (byParentId == null)
        {
            byParentId = theStructures.ToLookup(x => x.ParentStructureId);
        }
        List<Structure> children = byParentId[aStructure.Id].ToList();
        int count = children.Count;
        //Functionality continues...
    }
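For reference, a self-contained sketch of the same lookup idea applied to the recursive count from the question. The StructureIndex class and its CountDescendants method are hypothetical names; the sketch assumes the list is not modified after the lookup is built, because ToLookup takes a snapshot:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Structure
{
    public int Id { get; set; }
    public int ParentStructureId { get; set; }
}

class StructureIndex
{
    private readonly IList<Structure> theStructures;
    private ILookup<int, Structure> byParentId;

    public StructureIndex(IList<Structure> structures)
    {
        theStructures = structures;
    }

    public int CountDescendants(Structure aStructure)
    {
        // Built once; later additions to theStructures are not reflected.
        if (byParentId == null)
        {
            byParentId = theStructures.ToLookup(x => x.ParentStructureId);
        }

        int count = 0;
        // Indexing a lookup with a missing key yields an empty sequence.
        foreach (Structure child in byParentId[aStructure.Id])
        {
            count += 1 + CountDescendants(child);
        }
        return count;
    }
}

static class LookupDemo
{
    static void Main()
    {
        var all = new List<Structure>
        {
            new Structure { Id = 0, ParentStructureId = -1 },
            new Structure { Id = 1, ParentStructureId = 0 },
            new Structure { Id = 2, ParentStructureId = 1 },
        };
        var index = new StructureIndex(all);
        Console.WriteLine(index.CountDescendants(all[0])); // prints 2
    }
}
```

With the lookup, finding the children of one node no longer scans the whole list on every call.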
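Answer 4: (score: 0)
Because C# allows side effects, a property value cannot easily be proven, statically, to be safe to cache. For example, suppose this were your code: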
    public IEnumerable<Structure> FetchChildren(Structure aStructure)
    {
        for (int i = 0; i < 10; i++)
        {
            // Side effect: Id changes on every step of the enumeration.
            aStructure.Id++;
            // GetChild is a helper that is not shown here.
            yield return GetChild(aStructure.Id);
        }
    }
    public int GetChildrenSlow(Structure aStructure)
    {
        IEnumerable<Structure> childrenQuery =
            from structure in FetchChildren(aStructure)
            where structure.ParentStructureId == aStructure.Id
            select structure;
        int count = childrenQuery.Count();
        //Functionality continues...
    }
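As you can see, aStructure.Id changes while you enumerate. Yes, in your case none of your enumeration code has side effects, but C# is not smart enough to know that. Moreover, enumeration is not the only place a side effect can come from. For example: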
    IList<Structure> theStructures = new List<Structure>();
    public int GetChildrenSlow(Structure aStructure)
    {
        IEnumerable<Structure> childrenQuery =
            theStructures.Where(s => s.ParentStructureId == aStructure.Id++);
        int count = childrenQuery.Count();
        //Functionality continues...
    }
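And there is always multithreading to mess things up. Because mutation is possible, the cost of re-reading the property value on every check is necessary.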
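A small sketch of why the value has to be re-read, using invented names (DeferredDemo, CountBeforeAndAfterMutation). The query is deferred and captures the object, so a change to Id between two enumerations changes the result:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Structure
{
    public long Id { get; set; }
    public long ParentStructureId { get; set; }
}

static class DeferredDemo
{
    // Returns the child count before and after mutating the captured object's Id.
    public static int[] CountBeforeAndAfterMutation()
    {
        var items = new List<Structure>
        {
            new Structure { Id = 0, ParentStructureId = -1 },
            new Structure { Id = 1, ParentStructureId = 0 },
            new Structure { Id = 2, ParentStructureId = 0 },
            new Structure { Id = 3, ParentStructureId = 1 },
        };
        Structure parent = items[0];

        // Deferred: parent.Id is read each time the predicate runs.
        IEnumerable<Structure> query = items.Where(s => s.ParentStructureId == parent.Id);

        int before = query.Count(); // matches ParentStructureId == 0
        parent.Id = 1;              // mutation between enumerations
        int after = query.Count();  // now matches ParentStructureId == 1

        return new[] { before, after };
    }

    static void Main()
    {
        int[] counts = CountBeforeAndAfterMutation();
        Console.WriteLine(counts[0]); // prints 2
        Console.WriteLine(counts[1]); // prints 1
    }
}
```

Copying the Id into a local before building the query fixes the value at that moment, which is the same step that makes the question's fast version behave like a plain long comparison.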