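I have been looking into method inlining with the JIT and landed on this article by Scott Hanselman. I experimented further with his code, and it seems that although only a few frames appear in the call stack when the code runs in Release mode, the code actually runs as if those extra frames were still present in the compiled code (even though they are not reported as such).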
First, if you want to jump in and run it, I have placed the code here: https://github.com/Mike-EEE/StackOverflow.Performance
I have tried this on .NET 4.7.1, .NET Core 2.0, and even the recently announced .NET Core 2.1 Preview. All produce the same results.
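What I did was create a simple command that emits a message, and then a multi-decorated command that wraps this simple command several times. In the posted code, the decoration is applied 10 times, producing a nested command with 10 levels (11 if you count the original simple command).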
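The arrangement described above can be sketched roughly as follows. This is a simplified illustration, not the code from the repository: the type names EmitMessage and DecoratedCommand and the Execute(String message) signature are taken from the stack traces in this question, while the ICommand interface and the Decorate helper are hypothetical names introduced here. The intermediate methods that the real EmitMessage goes through are omitted.

```csharp
using System;

// Hypothetical interface; the question does not show the real abstraction.
public interface ICommand
{
    void Execute(string message);
}

// The simple command: hands the message to whatever delegate it was given.
public sealed class EmitMessage : ICommand
{
    private readonly Action<string> _emit;

    public EmitMessage(Action<string> emit)
    {
        _emit = emit;
    }

    public void Execute(string message)
    {
        _emit(message);
    }
}

// The decorator: does nothing except forward to the command it wraps.
public sealed class DecoratedCommand : ICommand
{
    private readonly ICommand _inner;

    public DecoratedCommand(ICommand inner)
    {
        _inner = inner;
    }

    public void Execute(string message)
    {
        _inner.Execute(message);
    }
}

public static class Program
{
    // Hypothetical helper: wraps the command the requested number of times.
    public static ICommand Decorate(ICommand command, int levels)
    {
        for (var i = 0; i < levels; i++)
        {
            command = new DecoratedCommand(command);
        }
        return command;
    }

    public static void Main()
    {
        // 10 decorators around one simple command: 11 commands in total.
        ICommand command = Decorate(new EmitMessage(m => Console.WriteLine(m)), 10);
        command.Execute("Hello");
    }
}
```

Because each decorator only forwards the call, any measurable difference between the direct and the decorated command comes from the forwarding itself.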
Both commands used in the tests emit the message through an empty delegate, because using Console.WriteLine during performance testing gets pretty ugly.
Before running the tests, I created a decorated command that uses the same code as the test code, but with Console.WriteLine instead of the empty delegate, to verify the stack trace in the current execution environment.
In Debug, this stack trace looks like this:
at StackOverflow.Performance.EmitMessage.Emit(String message)
at StackOverflow.Performance.EmitMessage.MethodC(String message)
at StackOverflow.Performance.EmitMessage.MethodB(String message)
at StackOverflow.Performance.EmitMessage.MethodA(String message)
at StackOverflow.Performance.EmitMessage.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.DecoratedCommand.Execute(String message)
at StackOverflow.Performance.Program.Main()
In Release, it looks like this:
at StackOverflow.Performance.EmitMessage.Emit(String message)
at StackOverflow.Performance.Program.Main()
So far, everything looks great and is exactly what I expected. However, I then ran both commands through BenchmarkDotNet to see the results in a performance setting. These results seem to show that the decorated command's call chain is executed in full, even though the emitted stack trace indicates that no such call chain exists:
// * Summary *
BenchmarkDotNet=v0.10.14, OS=Windows 10.0.16299.371 (1709/FallCreatorsUpdate/Redstone3)
Intel Core i7-4820K CPU 3.70GHz (Haswell), 1 CPU, 8 logical and 8 physical cores
.NET Core SDK=2.1.300-preview2-008533
  [Host]     : .NET Core 2.0.7 (CoreCLR 4.6.26328.01, CoreFX 4.6.26403.03), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.7 (CoreCLR 4.6.26328.01, CoreFX 4.6.26403.03), 64bit RyuJIT
    Method |      Mean |     Error |    StdDev |
---------- |----------:|----------:|----------:|
    Direct |  3.581 ns | 0.0759 ns | 0.0710 ns |
 Decorated | 44.646 ns | 0.7701 ns | 0.7203 ns |
So there seem to be more than 2 frames being executed here, which is what led me to post this question on StackOverflow. I have a few questions about this.
For completeness, all the code needed to run this sample is in the repository linked above.
Thanks in advance for any insights or help you can provide!
Answer 0 (score: 1)
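Shamelessly copied from Stephen Toub's work, from here:
I just took a look at the disassembly for DecoratedCommand.Execute, using a checked coreclr build and running with
set COMPlus_JitDisasm=Execute
(see the documentation). It is indeed using a tail call:
; Assembly listing for method DecoratedCommand:Execute(ref):this
; Emitting BLENDED_CODE for X64 CPU with AVX
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
;  V00 this         [V00,T00] (  3,  3   )     ref  ->  rcx         this class-hnd
;  V01 arg1         [V01,T01] (  3,  3   )     ref  ->  rdx         class-hnd
;# V02 OutArgs      [V02    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]
;
; Lcl frame size = 0
G_M223_IG01:
G_M223_IG02:
       488B4908             mov      rcx, gword ptr [rcx+8]
       49BB48007733FD7F0000 mov      r11, 0x7FFD33770048
       488B05934FE5FF       mov      rax, qword ptr [(reloc)]
       3909                 cmp      dword ptr [rcx], ecx
G_M223_IG03:
       48FFE0               rex.jmp  rax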