Question

The following is my code for computing pi = 3.1415... approximately, using this formula:

use Time;
var timer = new Timer();

config const n = 10**9;
var x = 0.0, s = 0.0;

// timer.start();                                     // [1]_____

for k in 0 .. n {
    s = ( if k % 2 == 0 then 1.0 else -1.0 );  // (-1)^k
    x += s / ( 2.0 * k + 1.0 );
}

// timer.stop();                                      // [2]_____
// writeln( "time = ", timer.elapsed() );             // [3]_____

   writef( "pi (approx) = %30.20dr\n", x * 4 );
// writef( "pi (exact)  = %30.20dr\n", pi );          // [4]_____
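
For reference, the loop above sums the Leibniz series truncated at k = n and multiplies the result by 4. This is read off from the code itself rather than from the linked formula, so treat it as a sketch:

```latex
\pi \approx 4 \sum_{k=0}^{n} \frac{(-1)^k}{2k+1}
```

Here `s` in the code holds the sign $(-1)^k$ and `x` accumulates the partial sum.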

When the above code is compiled with `chpl --fast test.chpl` and run as `time ./a.out`, it runs in ~4 seconds:

pi (approx) =         3.14159265458805059268

real    0m4.334s
user    0m4.333s
sys     0m0.006s

On the other hand, if I uncomment lines [1-3] (to use Timer), the program runs much slower, taking ~10 seconds:

time = 10.2284
pi (approx) =         3.14159265458805059268

real    0m10.238s
user    0m10.219s
sys     0m0.018s

The same slowdown happens when I uncomment only line [4] (to print the built-in value of pi, with lines [1-3] kept commented out):

pi (approx) =         3.14159265458805059268
pi (exact)  =         3.14159265358979311600

real    0m10.144s
user    0m10.141s
sys     0m0.009s

So I am wondering why this slowdown happens...

Am I missing something in the above code (e.g., using Timer incorrectly)?

My environment is OSX 10.11 + chapel-1.16 installed via Homebrew. More details are below:

$ printchplenv --anonymize
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: clang
CHPL_TARGET_ARCH: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_MAKE: make
CHPL_ATOMICS: intrinsics
CHPL_GMP: gmp
CHPL_HWLOC: hwloc
CHPL_REGEXP: re2
CHPL_WIDE_POINTERS: struct
CHPL_AUX_FILESYS: none

$ clang --version
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

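Update

Following the suggestions, I installed Chapel from source by following this and this page and adding CHPL_TARGET_COMPILER=gnu to ~/.chplconfig (before running make). After that, all three cases above ran in ~4 seconds. So the problem may be related to clang on OSX 10.11. According to the comments, newer OSX (>= 10.12) does not have this problem, so simply upgrading to a newer OSX / clang (>= 9.0) may be sufficient. FYI, the updated environment information (with GNU) is as follows: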

$ printchplenv --anonymize
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: gnu +
CHPL_TARGET_ARCH: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_MAKE: make
CHPL_ATOMICS: intrinsics
CHPL_GMP: none
CHPL_HWLOC: hwloc
CHPL_REGEXP: none
CHPL_WIDE_POINTERS: struct
CHPL_AUX_FILESYS: none
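
The ~/.chplconfig change mentioned in the update comes down to a single line. A minimal sketch, assuming the file did not already set this variable:

```text
CHPL_TARGET_COMPILER=gnu
```

As far as I understand, the `+` after `gnu` in the printchplenv output above marks a value that was picked up from this configuration file rather than inferred by default.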

Answer 1

> Am I missing something in the above code (e.g., using Timer incorrectly)?

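No, you are not missing anything, and you are using Timer (and Chapel) in a completely reasonable way. Based on my own experiments (which confirm yours and are noted in the comments on your question), this looks like a back-end compiler issue rather than a fundamental problem in Chapel or in your use of it.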

Answer 2

`--fast` reduces run-time checks, but that is not the issue here.

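Note how large the add-on setup and operation overheads are. A forall construct equipped with the Atomics .add() method, brought in here for educational purposes (as an attempt at concurrent processing), incurs overheads that are much higher than the gains from parallel processing. The work done in the [PAR]-enabled part of the process is too thin (ref. the newly re-formulated Amdahl's Law) to pay back the add-on costs of going parallel.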

An illustrative example:

[SEQ]
