Question

Using mono on Mac OS, if I compile and profile the program below, I get the following results:

% fsharpc --nologo -g foo.fs -o foo.exe
% mono --profile=default:stat foo.exe
...
Statistical samples summary
    Sample type: cycles
    Unmanaged hits:     336 (49.1%)
    Managed hits:       349 (50.9%)
    Unresolved hits:      1 ( 0.1%)
  Hits      % Method name
   154  22.48 Microsoft.FSharp.Collections.SetTreeModule:height ...
   105  15.33 semaphore_wait_trap
    74  10.80 Microsoft.FSharp.Collections.SetTreeModule:add ...
...

Note the second entry, semaphore_wait_trap. Here is the program:

[<EntryPoint>]
let main args = 
    let s = seq { 1..1000000 } |> Set.ofSeq
    s |> Seq.iter (fun _ -> ())
    0

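I looked at the source for the Set module, but I didn't find any (obvious) locking.

Is my single-threaded program really spending 15% of its execution time messing with semaphores? If so, can I make it stop doing that and get a performance boost?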

Answer 1

According to Instruments, it is sgen (the GC) that calls semaphore_wait_trap.

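Sgen is documented as stopping all other threads while it collects:

Before doing a collection (minor or major), the collector must stop all running threads so that it can have a stable view of the current state of the heap, without the other threads changing it

In other words, when the code tries to allocate memory and a GC is required, the time that takes shows up under semaphore_wait_trap, since that is your application thread waiting. I suspect the mono profiler doesn't profile the gc thread itself, so you don't see the time in the collection code.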
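One way to check from inside the program that the workload really does trigger collections is to read the runtime's own collection counter around it. The following is a minimal sketch, not something from the original answer: the helper name `measureGc` is hypothetical, and it assumes `System.GC.CollectionCount` reports sgen's nursery collections as generation 0.

```fsharp
open System
open System.Diagnostics

/// Hypothetical helper: runs `work` and returns its result, the elapsed
/// wall-clock time in milliseconds, and the number of generation-0
/// collections the runtime reported while it ran.
let measureGc (work: unit -> 'a) : 'a * int64 * int =
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    let result = work ()
    sw.Stop()
    let gen0After = GC.CollectionCount(0)
    result, sw.ElapsedMilliseconds, gen0After - gen0Before

// Same workload as in the question: build a one-million-element Set.
let s, ms, gen0 = measureGc (fun () -> seq { 1..1000000 } |> Set.ofSeq)
printfn "elements=%d elapsed=%dms gen0 collections=%d" (Set.count s) ms gen0
```

If the reported count is well above zero, the application thread was paused that many times while the collector ran, which is consistent with the wait showing up in the profile.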

The germane output, then, is really the GC summary:

GC summary
    GC resizes: 0
    Max heap size: 0
    Object moves: 1002691
    Gen0 collections: 123, max time: 14187us, total time: 354803us, average: 2884us
    Gen1 collections: 3, max time: 41336us, total time: 60281us, average: 20093us

If you want your code to run faster, don't collect as often.

Understanding the actual cost of collection can be done through dtrace, since sgen has dtrace probes.
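Collecting less often mostly comes down to allocating less. As a rough illustration (an assumption on my part, not part of the original answer), the sketch below compares the immutable F# `Set`, which allocates new tree nodes on every insertion, with a mutable `HashSet<int>`, which reuses its backing arrays. The helper `gen0During` is hypothetical, and swapping in `HashSet` only makes sense if you don't need an immutable, ordered set.

```fsharp
open System
open System.Collections.Generic

/// Hypothetical helper: counts the generation-0 collections the runtime
/// reports while `work` runs.
let gen0During (work: unit -> unit) : int =
    let before = GC.CollectionCount(0)
    work ()
    GC.CollectionCount(0) - before

let n = 1000000

// Immutable Set: each insertion allocates new tree nodes.
let immutableGen0 =
    gen0During (fun () -> seq { 1..n } |> Set.ofSeq |> Seq.iter ignore)

// Mutable HashSet: insertions write into the same backing arrays,
// which are only reallocated when the set grows.
let mutableGen0 =
    gen0During (fun () ->
        let h = HashSet<int>()
        for i in 1..n do
            h.Add(i) |> ignore
        for _ in h do
            ())

printfn "Set: %d gen0 collections, HashSet: %d gen0 collections" immutableGen0 mutableGen0
```

The exact numbers depend on the runtime and its nursery size, so treat them as a relative comparison only.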

Single-threaded program profiles 15% of runtime in semaphore_wait_trap
