I found this in the .NET source code: it claims to be 100 times faster than System.Double.IsNaN. Is there any reason not to use this function instead of System.Double.IsNaN?
[StructLayout(LayoutKind.Explicit)]
private struct NanUnion
{
    [FieldOffset(0)] internal double DoubleValue;
    [FieldOffset(0)] internal UInt64 UintValue;
}
// The standard CLR double.IsNaN() function is approximately 100 times slower than our own wrapper,
// so please make sure to use DoubleUtil.IsNaN() in performance sensitive code.
// PS item that tracks the CLR improvement is DevDiv Schedule : 26916.
// IEEE 754 : If the argument is any value in the range 0x7ff0000000000001L through 0x7fffffffffffffffL
// or in the range 0xfff0000000000001L through 0xffffffffffffffffL, the result will be NaN.
public static bool IsNaN(double value)
{
    NanUnion t = new NanUnion();
    t.DoubleValue = value;
    UInt64 exp = t.UintValue & 0xfff0000000000000;
    UInt64 man = t.UintValue & 0x000fffffffffffff;
    return (exp == 0x7ff0000000000000 || exp == 0xfff0000000000000) && (man != 0);
}
Edit: still according to the .NET source code, the code for System.Double.IsNaN is as follows:
public unsafe static bool IsNaN(double d)
{
    return (*(UInt64*)(&d) & 0x7FFFFFFFFFFFFFFFL) > 0x7FF0000000000000L;
}
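To see why that single masked comparison is enough, here is a minimal sketch of the same test written without unsafe code. The class NaNBits and the method IsNaNByBits are hypothetical names chosen for illustration, not part of the .NET source; the only framework call used is BitConverter.DoubleToInt64Bits.

```csharp
using System;

public static class NaNBits
{
    // Hypothetical helper, not taken from the .NET source: the same bit test
    // as the pointer-cast version, using BitConverter instead of unsafe code.
    public static bool IsNaNByBits(double d)
    {
        ulong bits = unchecked((ulong)BitConverter.DoubleToInt64Bits(d));

        // Clearing the sign bit leaves the exponent and mantissa.
        // 0x7FF0000000000000 is +Infinity (all-ones exponent, zero mantissa),
        // so any larger value has an all-ones exponent and a nonzero mantissa,
        // which is exactly the IEEE 754 definition of NaN.
        return (bits & 0x7FFFFFFFFFFFFFFFUL) > 0x7FF0000000000000UL;
    }
}
```

Because the sign bit is masked away first, the two NaN ranges listed in the DoubleUtil comment collapse into one range, which is why one comparison replaces the two exponent checks plus the mantissa check.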
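Answer 0 (score: 59)
It claims to be 100 times faster than System.Double.IsNaN
Yes, that used to be true. You are missing the time machine needed to know when this decision was made. Double.IsNaN() did not use to look like that. From the SSCLI10 source code: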
public static bool IsNaN(double d)
{
    // Comparisions of a NaN with another number is always false and hence both conditions will be false.
    if (d < 0d || d >= 0d) {
        return false;
    }
    return true;
}
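This performs very poorly on the FPU in 32-bit code if d is NaN. That is just an aspect of the chip design: NaN is treated as an exceptional case in the microcode. The Intel processor manuals say very little about it, other than documenting a processor performance counter that tracks the number of "Floating Point assists" and noting that the microcode sequencer comes into play for denormals and NaNs, "potentially costing hundreds of cycles". It is not otherwise an issue in 64-bit code, which uses SSE2 instructions that do not have this performance hit.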
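As an aside on why the comparison-based version is correct at all, the following is a minimal sketch of the IEEE 754 behaviour it relies on. NaNCompare and its two methods are hypothetical names used only for illustration; the first mirrors the SSCLI logic quoted above, the second is the equivalent self-comparison form.

```csharp
using System;

public static class NaNCompare
{
    // Hypothetical helper mirroring the SSCLI logic: every ordered comparison
    // that involves NaN evaluates to false, so NaN is the only input for
    // which both (d < 0) and (d >= 0) fail.
    public static bool IsNaNByCompare(double d)
    {
        return !(d < 0d || d >= 0d);
    }

    // NaN is also the only double that compares unequal to itself.
    public static bool IsNaNBySelfCompare(double d)
    {
#pragma warning disable CS1718 // comparison made to same variable is intentional
        return d != d;
#pragma warning restore CS1718
    }
}
```

Both forms give the right answer on any platform; the point of this answer is only that the floating-point comparisons were slow for NaN inputs on the x87 FPU.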
Some code to play with to see the slowdown yourself:
using System;
using System.Diagnostics;
class Program {
    static void Main(string[] args) {
        double d = double.NaN;
        for (int test = 0; test < 10; ++test) {
            var sw1 = Stopwatch.StartNew();
            bool result1 = false;
            for (int ix = 0; ix < 1000 * 1000; ++ix) {
                result1 |= double.IsNaN(d);
            }
            sw1.Stop();
            var sw2 = Stopwatch.StartNew();
            bool result2 = false;
            for (int ix = 0; ix < 1000 * 1000; ++ix) {
                result2 |= IsNaN(d);
            }
            sw2.Stop();
            Console.WriteLine("{0} - {1} x {2}%", sw1.Elapsed, sw2.Elapsed, 100 * sw2.ElapsedTicks / sw1.ElapsedTicks, result1, result2);
        }
        Console.ReadLine();
    }
    public static bool IsNaN(double d) {
        // Comparisions of a NaN with another number is always false and hence both conditions will be false.
        if (d < 0d || d >= 0d) {
            return false;
        }
        return true;
    }
}
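The first column uses the version of Double.IsNaN() that got micro-optimized; the second uses the old comparison-based code. Such micro-optimizations are not evil in a framework, by the way: the great burden of the Microsoft .NET programmers is that they can rarely guess when their code is in the critical path of an application.
Results on my machine when targeting 32-bit code (Haswell mobile core):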
00:00:00.0027095 - 00:00:00.2427242 x 8957%
00:00:00.0025248 - 00:00:00.2191291 x 8678%
00:00:00.0024344 - 00:00:00.2209950 x 9077%
00:00:00.0024144 - 00:00:00.2321169 x 9613%
00:00:00.0024126 - 00:00:00.2173313 x 9008%
00:00:00.0025488 - 00:00:00.2237517 x 8778%
00:00:00.0026940 - 00:00:00.2231146 x 8281%
00:00:00.0025052 - 00:00:00.2145660 x 8564%
00:00:00.0025533 - 00:00:00.2200943 x 8619%
00:00:00.0024406 - 00:00:00.2135839 x 8751%
Answer 1 (score: 12)
Here is a naive benchmark:
public static void Main()
{
    int iterations = 500 * 1000 * 1000;
    double nan = double.NaN;
    double notNan = 42;
    // IsNaN below is the NanUnion-based wrapper from the question.
    Stopwatch sw = Stopwatch.StartNew();
    bool isNan;
    for (int i = 0; i < iterations; i++)
    {
        isNan = IsNaN(nan); // true
        isNan = IsNaN(notNan); // false
    }
    sw.Stop();
    Console.WriteLine("IsNaN: {0}", sw.ElapsedMilliseconds);
    sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
    {
        isNan = double.IsNaN(nan); // true
        isNan = double.IsNaN(notNan); // false
    }
    sw.Stop();
    Console.WriteLine("double.IsNaN: {0}", sw.ElapsedMilliseconds);
    Console.Read();
}
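Apparently they are wrong:
IsNaN: 15012
double.IsNaN: 6243
Edit + note: I am sure the timings will change depending on the input values, many other factors, etc., but claiming that, generally speaking, this wrapper is 100 times faster than the default implementation seems just wrong.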
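Since the numbers depend so much on the setup, a slightly more careful harness may help when repeating the measurement. The sketch below is only an illustration under stated assumptions: NaNBench, CountNaN and Measure are hypothetical names, the predicate is passed as a delegate (which adds the same call overhead to every candidate), and one untimed warm-up pass is used to keep JIT compilation out of the measurement.

```csharp
using System;
using System.Diagnostics;

public static class NaNBench
{
    // Hypothetical helper: counts how many inputs the predicate flags, so the
    // result of every call is consumed and the loop cannot be optimized away.
    public static int CountNaN(double[] values, Func<double, bool> isNaN)
    {
        int count = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (isNaN(values[i])) count++;
        }
        return count;
    }

    // Runs one untimed warm-up pass, then times 'rounds' passes over the data.
    // 'total' returns the accumulated count so callers can sanity-check it.
    public static TimeSpan Measure(double[] values, Func<double, bool> isNaN,
                                   int rounds, out int total)
    {
        CountNaN(values, isNaN); // warm-up: JIT compilation happens here
        total = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            total += CountNaN(values, isNaN);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

A call such as NaNBench.Measure(data, double.IsNaN, 100, out total) can then be compared with the same call for the wrapper, ideally in a Release build, for both 32-bit and 64-bit targets, and with inputs that are all NaN, all ordinary numbers, and a mix.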
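Answer 2 (score: 7)
I call shenanigans. The "fast" version has a considerably larger number of operations and even performs more reads from memory (from the stack, so in L1, but still slower than registers).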
00007FFAC53D3D01 movups xmmword ptr [rsp+8],xmm0
00007FFAC53D3D06 sub rsp,48h
00007FFAC53D3D0A mov qword ptr [rsp+20h],0
00007FFAC53D3D13 mov qword ptr [rsp+28h],0
00007FFAC53D3D1C mov qword ptr [rsp+30h],0
00007FFAC53D3D25 mov rax,7FFAC5423D40h
00007FFAC53D3D2F mov eax,dword ptr [rax]
00007FFAC53D3D31 test eax,eax
00007FFAC53D3D33 je 00007FFAC53D3D3A
00007FFAC53D3D35 call 00007FFB24EE39F0
00007FFAC53D3D3A mov r8d,8
00007FFAC53D3D40 xor edx,edx
00007FFAC53D3D42 lea rcx,[rsp+20h]
00007FFAC53D3D47 call 00007FFB24A21680
t.DoubleValue = value;
00007FFAC53D3D4C movsd xmm5,mmword ptr [rsp+50h]
00007FFAC53D3D52 movsd mmword ptr [rsp+20h],xmm5
UInt64 exp = t.UintValue & 0xfff0000000000000;
00007FFAC53D3D58 mov rax,qword ptr [rsp+20h]
00007FFAC53D3D5D mov rcx,0FFF0000000000000h
00007FFAC53D3D67 and rax,rcx
00007FFAC53D3D6A mov qword ptr [rsp+28h],rax
UInt64 man = t.UintValue & 0x000fffffffffffff;
00007FFAC53D3D6F mov rax,qword ptr [rsp+20h]
00007FFAC53D3D74 mov rcx,0FFFFFFFFFFFFFh
00007FFAC53D3D7E and rax,rcx
00007FFAC53D3D81 mov qword ptr [rsp+30h],rax
return (exp == 0x7ff0000000000000 || exp == 0xfff0000000000000) && (man != 0);
00007FFAC53D3D86 mov rax,7FF0000000000000h
00007FFAC53D3D90 cmp qword ptr [rsp+28h],rax
00007FFAC53D3D95 je 00007FFAC53D3DA8
00007FFAC53D3D97 mov rax,0FFF0000000000000h
00007FFAC53D3DA1 cmp qword ptr [rsp+28h],rax
00007FFAC53D3DA6 jne 00007FFAC53D3DBD
00007FFAC53D3DA8 xor eax,eax
00007FFAC53D3DAA cmp qword ptr [rsp+30h],0
00007FFAC53D3DB0 setne al
00007FFAC53D3DB3 mov dword ptr [rsp+38h],eax
00007FFAC53D3DB7 mov al,byte ptr [rsp+38h]
00007FFAC53D3DBB jmp 00007FFAC53D3DC1
00007FFAC53D3DBD xor eax,eax
00007FFAC53D3DBF jmp 00007FFAC53D3DC1
00007FFAC53D3DC1 nop
00007FFAC53D3DC2 add rsp,48h
00007FFAC53D3DC6 ret
Versus the .NET version:
return (*(UInt64*)(&d) & 0x7FFFFFFFFFFFFFFFL) > 0x7FF0000000000000L;
00007FFAC53D3DE0 movsd mmword ptr [rsp+8],xmm0
00007FFAC53D3DE6 sub rsp,38h
00007FFAC53D3DEA mov rax,7FFAC5423D40h
00007FFAC53D3DF4 mov eax,dword ptr [rax]
00007FFAC53D3DF6 test eax,eax
00007FFAC53D3DF8 je 00007FFAC53D3DFF
00007FFAC53D3DFA call 00007FFB24EE39F0
00007FFAC53D3DFF mov rdx,qword ptr [rsp+40h]
00007FFAC53D3E04 mov rax,7FFFFFFFFFFFFFFFh
00007FFAC53D3E0E and rdx,rax
00007FFAC53D3E11 xor ecx,ecx
00007FFAC53D3E13 mov rax,7FF0000000000000h
00007FFAC53D3E1D cmp rdx,rax
00007FFAC53D3E20 seta cl
00007FFAC53D3E23 mov dword ptr [rsp+20h],ecx
00007FFAC53D3E27 movzx eax,byte ptr [rsp+20h]
00007FFAC53D3E2C jmp 00007FFAC53D3E2E
00007FFAC53D3E2E nop
00007FFAC53D3E2F add rsp,38h
00007FFAC53D3E33 ret