Chapel 1.16.0 pre-release internal error (-999): what other approach could be taken for a declaration that is agnostic of the domain size a priori?

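Time: 2017-08-18 17:26:43

Tags: chapel

Prototyping has returned an internal error:

While the purpose of this particular setup is neither important nor relevant,
the compiler finished with the following debug notice,
and any suggestions on a syntax that avoids the crash would be appreciated: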

<TiO>-IDE-Debug::____________________________________________________

.code.tio.chpl:77: internal error: IMP0586 chpl Version 1.16.0 pre-release (-999)

Note: This source location is a guess.

Internal errors indicate a bug in the Chapel compiler ("It's us, not you"),
and we're sorry for the hassle.  We would appreciate your reporting this bug -- 
please see http://chapel.cray.com/bugs.html for instructions.  In the meantime,
the filename + line number above may be useful in working around the issue.


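(The compiler team will obviously have some additional interest in, and concerns about, the internal handling of the observed case, which is not the main purpose or subject of this post.)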

The code, live @ <TiO>-IDE::

/* ---------------------------------------SETUP-SECTION-UNDER-TEST--*/ use Time;
/* ---------------------------------------SETUP-SECTION-UNDER-TEST--*/ var aStopWATCH_RND_GEN: Timer;
/* ---------------------------------------SETUP-SECTION-UNDER-TEST--*/ var aStopWATCH_LIN_ALG: Timer;
/* ---------------------------------------SETUP-SECTION-UNDER-TEST--*/ var aStopWATCH_MAT_REC: Timer;
/* ---------------------------------------SETUP-SECTION-UNDER-TEST--*/ var aStopWATCH_ARR_REC: Timer;
config const n_power =         5;
config const L_size  =      1000;
       const indices = 1..L_size;
       const aDomain = {indices, indices};

       var   A: [aDomain] real(64); // real(32); // may've shown some byte-word alignment artifacts
       var   B: [aDomain] real(64); // real(32); // may've shown some byte-word alignment artifacts
       const dtype =    "-real(64)";
       var   S: [aDomain] real(64); // real(32); // OK: must've been set real(64) to avoid /LinearAlgebra.chpl:535: error: type mismatch in assignment from real(64) to real(32)

/* -----------------------------------------------------------------*/ use Random;
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.start();
    Random.fillRandom(  A );
    Random.fillRandom(  B );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.stop();
/* 

   ============================================ */

proc arrMUL( arrA: [?DA] real(64),
             arrB: [?DB] real(64)
             ) {                      /*
                                         <Brad> If the domain/size of the array being returned cannot be described directly in the function prototype,
                                                I believe your best bet at present is to omit any description of the return type and lean on Chapel's type inference machinery
                                                to determine that you're returning an array

                                                >>> https://stackoverflow.com/a/39420337/3666197

                                                */
     var                       arrC: [aDomain] real(64);
                                      /*
                                                <TiO>-IDE-Debug::____________________________________________________

                                                .code.tio.chpl:77: internal error: IMP0586 chpl Version 1.16.0 pre-release (-999)

                                                Note: This source location is a guess.

                                                Internal errors indicate a bug in the Chapel compiler ("It's us, not you"),
                                                and we're sorry for the hassle.  We would appreciate your reporting this bug -- 
                                                please see http://chapel.cray.com/bugs.html for instructions.  In the meantime,
                                                the filename + line number above may be useful in working around the issue.

                                                */

 /*  var                       arrC: [{1..arrA.dim( 1 ).length(),       // ..#arrA.dim( 1 ),
                                       1..arrB.dim( 2 ).length()        // ..#arrB.dim( 2 )
                                       }
                                      ] real(64);

                                                <TiO>-IDE-Debug::____________________________________________________

                                                .code.tio.chpl:49: error: unresolved call '[domain(2,int(64),false)] real(64).dim(1)'
                                                $CHPL_HOME/modules/internal/ChapelArray.chpl:1215: note: candidates are: _domain.dim(d: int)
                                                $CHPL_HOME/modules/internal/ChapelArray.chpl:1218: note:                 _domain.dim(param d: int)

                                                */
  // forall      (row, col) in arrC.domain {    // [ROW:77] reports: internal error: IMP0586 chpl Version 1.16.0 pre-release (-999)
     forall      (row, col) in     aDomain {    // [ROW:78] reports: internal error: IMP0586 chpl Version 1.16.0 pre-release (-999) 
        for                              i in arrA.dim( 2 ) do
             arrC[row, col] += arrA[row, i]
                             * arrB[     i, col];
     }
     return  arrC;
}

proc arr_REC_POW( arrM: [?D] real(64),
                  n:          int(64) // int(32) failed:
                                      //      <- config const n_power = 5 // .code.tio.chpl:64: error: unresolved call 'arr_REC_POW([domain(2,int(64),false)] real(64), int(64))'
                  ):    [ D] real(64) {     /* 
                                                <Brad> If the domain/size of the array being returned cannot be described directly in the function prototype,
                                                       I believe your best bet at present is to omit any description of the return type and lean on Chapel's type inference machinery
                                                       to determine that you're returning an array

                                                       >>> https://stackoverflow.com/a/39420337/3666197

                                                <TiO>-IDE-Debug::____________________________________________________

                                                .code.tio.chpl:56: error: unable to resolve return type of function 'arr_REC_POW'
                                                .code.tio.chpl:56: In function 'arr_REC_POW':
                                                .code.tio.chpl:61: error: called recursively at this point


                                                // The ? operator is called the query operator, and is used to take
                                                // undetermined values like tuple or array sizes and generic types.
                                                // For example, taking arrays as parameters. The query operator is used to
                                                // determine the domain of A. This is useful for defining the return type,
                                                // though it's not required.

                                                //                  (c) 2017 Ian J. Bertolacci, Ben Harshbarger
                                                // Originally contributed by Ian J. Bertolacci, and updated by 8 contributor(s).

                                                        >>> https://learnxinyminutes.com/docs/chapel/>
                                                */

     if      n < 1 then return         arrM;
     else               return arrMUL( arrM, arr_REC_POW( arrM, n - 1 ) );
}

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.start();

   forall (row, col)             in S.domain {
         S[row, col] = arr_REC_POW( A, n_power )[row,col]
                     + arr_REC_POW( B, n_power )[row,col];
   }
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.stop();
/* 

   ============================================ */

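Reduced for the <TiO>-IDE (which unfortunately lacks the code-folding productivity found in other IDE environments. Agreed with Ben that whether an experiment-audit, self-documenting layout is more readable comes down to personal preference)

Still: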

  

chpl:30: internal error: IMP0586 chpl Version 1.16.0 pre-release (-999)

chpl:30 being:

forall      (row, col) in    aDomain {

>>> aClickThrough-with-an-updated-code, no syntax warnings but (-999) @ <TiO>-IDE

                    use Time;

var aStopWATCH_RND_GEN: Time.Timer;
var aStopWATCH_LIN_ALG: Time.Timer;
var aStopWATCH_MAT_REC: Time.Timer;
var aStopWATCH_ARR_REC: Time.Timer;

config const n_power =         5;
config const L_size  =      1000;
       const indices = 1..L_size;
       const aDomain = {indices, indices};

       var   A: [aDomain] real(64);
       var   B: [aDomain] real(64);
       const dtype =    "-real(64)";
       var   S: [aDomain] real(64);

use Random;
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.start();
    Random.fillRandom(  A );
    Random.fillRandom(  B );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.stop();

proc arrMUL( arrA: [?DA] real(64),
             arrB: [?DB] real(64)
             ) {

     var     arrC: [aDomain] real(64);

     forall      (row, col) in    aDomain {
             arrC[row, col]  = 0;
        for                              i in arrA.dim( 2 ) do
             arrC[row, col] += arrA[row, i]
                             * arrB[     i, col];
     }
     return  arrC;
}

proc arr_REC_POW( arrM: [?D] real(64),
                  n:          int(64)
                  ):    [ D] real(64) {

     if      n < 1 then return         arrM;
     else               return arrMUL( arrM, arr_REC_POW( arrM, n - 1 ) );
}

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.start();
   forall (row, col)             in S.domain {
         S[row, col] = arr_REC_POW( A, n_power )[row,col]
                     + arr_REC_POW( B, n_power )[row,col];
   }
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.stop();

     use LinearAlgebra;
var mA = LinearAlgebra.Matrix( A );
var mB = LinearAlgebra.Matrix( B );
var mS = LinearAlgebra.Matrix( S );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.start();
    mS = LinearAlgebra.matPlus( LinearAlgebra.matPow( mA, n_power ),
                                LinearAlgebra.matPow( mB, n_power )
                                );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.stop();

proc mat_REC_POW( matM: [] real(64),
                  n:        int(64)
                  ) {

     if      n < 1 then return                    matM;
     else               return LinearAlgebra.dot( matM, mat_REC_POW( matM, n - 1 ) );
}

/* -----------------------------------------------re-fill-m?[,]-----*/
    Random.fillRandom(  A ); mA = Matrix( A ); // re-fill mA[,]
    Random.fillRandom(  B ); mB = Matrix( B ); // re-fill mB[,]
/* -----------------------------------------------re-fill-m?[,]-----*/

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_MAT_REC.start();
   forall  (row, col)              in mS.domain {
         mS[row, col]  = mat_REC_POW( mA, n_power )[row,col]
                       + mat_REC_POW( mB, n_power )[row,col];
   }
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_MAT_REC.stop();

/* |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| PERF--*/

writeln( ".fillRandom() took ",          aStopWATCH_RND_GEN.elapsed( Time.TimeUnits.microseconds ), " [us] for A[,], B[,] having ", 2 * ( L_size * L_size ), dtype, " elements in total." );
writeln(
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_LIN_ALG.elapsed( Time.TimeUnits.microseconds ), " [us] in [LIN_ALG] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE",
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_MAT_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [MAT_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE",
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_ARR_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [ARR_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE"
         );
/* ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| INF--*/

writeln(                     "<TiO>-IDE-LocaleSpace is: ", LocaleSpace, " massive. Code is executing [here], being Locale ", here.id  );
for                                                i in    LocaleSpace do
    writeln(                 "          Locale #", i, "'s ID is: ", Locales[i].id );

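1 answer:

Answer 0 (score: 0)

Mission accomplished! The domain helper is used now; the recursion still needs some investigation.

Many thanks to everyone who helped get this done.
The earlier WIP interim remarks have been left in place for educational purposes.
[OK]: the code now passes the compiler's initial syntax checks,

but BLAS + ATLAS does not work yet as advised in the v1.15 / v1.16 documentation (this is being resolved with the help of the <TiO>-IDE administrator)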

                    use Time;

var aStopWATCH_RND_GEN: Time.Timer;
var aStopWATCH_LIN_ALG: Time.Timer;
var aStopWATCH_MAT_REC: Time.Timer;
var aStopWATCH_ARR_REC: Time.Timer;

config const n_power =         5;
config const L_size  =      1000;
       const indices = 1..L_size;
       const aDomain = {indices, indices};

       var   A: [aDomain] real(64);
       var   B: [aDomain] real(64);
       const dtype =    "-real(64)";
       var   S: [aDomain] real(64);

use Random;
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.start();
    Random.fillRandom(  A );
    Random.fillRandom(  B );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.stop();

proc arrMUL( arrA: [?DA] real(64),
             arrB: [?DB] real(64)
             ) {

     var     arrC: [aDomain] real(64);

     forall      (row, col) in    aDomain {
             arrC[row, col]  = 0;
     // for                              i in arrA.dim( 2 ) do          // wrong: .dim(2) was called on the array itself; dim is only defined on the domain, not the array
        for                              i in arrA.domain.dim( 2 ) do   // fixed: .dim(2) is now called on the array's domain
             arrC[row, col] += arrA[row, i]
                             * arrB[     i, col];
     }
     return  arrC;
}

proc arr_REC_POW( arrM: [?D] real(64),
                  n:          int(64)
                  ):    [ D] real(64) {

     if      n < 1 then return         arrM;
     else               return arrMUL( arrM, arr_REC_POW( arrM, n - 1 ) );
}

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.start();
   forall (row, col)             in S.domain {
         S[row, col] = arr_REC_POW( A, n_power )[row,col]
                     + arr_REC_POW( B, n_power )[row,col];
   }
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_ARR_REC.stop();

     use LinearAlgebra;
var mA = LinearAlgebra.Matrix( A );
var mB = LinearAlgebra.Matrix( B );
var mS = LinearAlgebra.Matrix( S );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.start();
    mS = LinearAlgebra.matPlus( LinearAlgebra.matPow( mA, n_power ),
                                LinearAlgebra.matPow( mB, n_power )
                                );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.stop();

proc mat_REC_POW( matM: [?Dm] real(64),
                  n:           int(64)
                  ):    [ Dm] real(64) {

 //  if      n < 1 then return                    matM;                                           // chpl:65: error: unable to resolve return type of function 'mat_REC_POW'
     if      n < 1 then return LinearAlgebra.dot( matM, LinearAlgebra.eye( matM.shape[1] ) );     // [DID NOT HELP]: added: so as to help compiler assume the return-type
     else               return LinearAlgebra.dot( matM,       mat_REC_POW( matM, n - 1 ) );       // chpl:70: error: called recursively at this point
}

/* -----------------------------------------------re-fill-m?[,]-----*/
    Random.fillRandom(  A ); mA = Matrix( A ); // re-fill mA[,]
    Random.fillRandom(  B ); mB = Matrix( B ); // re-fill mB[,]
/* -----------------------------------------------re-fill-m?[,]-----*/

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_MAT_REC.start();
   forall  (row, col)              in mS.domain {
         mS[row, col]  = mat_REC_POW( mA, n_power )[row,col]
                       + mat_REC_POW( mB, n_power )[row,col];
   }
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_MAT_REC.stop();

/* |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| PERF--*/

writeln( ".fillRandom() took ",          aStopWATCH_RND_GEN.elapsed( Time.TimeUnits.microseconds ), " [us] for A[,], B[,] having ", 2 * ( L_size * L_size ), dtype, " elements in total." );
writeln(
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_LIN_ALG.elapsed( Time.TimeUnits.microseconds ), " [us] in [LIN_ALG] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE",
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_MAT_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [MAT_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE",
        "\n <SECTION-UNDER-TEST> took ", aStopWATCH_ARR_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [ARR_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE"
         );
/* ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| INF--*/

writeln(                     "<TiO>-IDE-LocaleSpace is: ", LocaleSpace, " massive. Code is executing [here], being Locale ", here.id  );
for                                                i in    LocaleSpace do
    writeln(                 "          Locale #", i, "'s ID is: ", Locales[i].id );
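
One way to sidestep both the hard-wired aDomain inside arrMUL and the return-type trouble around the recursive calls might be to size the result from the queried argument domains and to replace the recursion with a plain loop. The sketch below is only an illustration under stated assumptions, not the fix that was adopted here: arrPOW is a hypothetical name, the input is assumed to be a square 2-D array, and dims() is used instead of dim( 1 ) / dim( 2 ) so that the code does not depend on how a given Chapel release numbers the dimensions.

```chapel
// Matrix product whose result domain is derived from the queried
// domains of its arguments instead of from a global constant.
proc arrMUL( arrA: [?DA] real(64),
             arrB: [?DB] real(64) ) {
     const (rowsA,  innerA) = DA.dims();     // ranges of arrA's two dimensions
     const (innerB, colsB ) = DB.dims();     // ranges of arrB's two dimensions

     if innerA != innerB then
          halt( "arrMUL: inner index ranges differ" );

     const DC = {rowsA, colsB};              // result domain, known only at run time
     var   arrC: [DC] real(64);

     forall (row, col) in DC {
          var acc = 0.0;                     // task-local accumulator
          for i in innerA do
               acc += arrA[row, i] * arrB[i, col];
          arrC[row, col] = acc;
     }
     return arrC;                            // return type is left to type inference
}

// Hypothetical iterative replacement for arr_REC_POW: computes arrM^n for n >= 1
// (for n < 1 it simply returns a copy of arrM). Assumes arrM is square.
proc arrPOW( arrM: [?D] real(64), n: int ) {
     var result = arrM;                      // array initialisation makes a copy
     for i in 2..n do
          result = arrMUL( result, arrM );
     return result;
}

// Small usage check: ( 2 * I )^3 should give 8.0 on the diagonal, 0.0 elsewhere.
config const L_size = 4;
var A: [1..L_size, 1..L_size] real(64);
forall (r, c) in A.domain do
     A[r, c] = if r == c then 2.0 else 0.0;
writeln( arrPOW( A, 3 ) );
```

Because no function calls itself, the compiler can infer the return type of each one without a declared `: [ D] real(64)`, which is in line with Brad's advice quoted in the question. Note also that, as written, the recursive arr_REC_POW returns arrM for n < 1 and multiplies once more per level above that, so it yields one more factor than matPow( mA, n_power ); the loop above is meant to match the matPow exponent.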

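BLAS + ATLAS protest when one tries to get them compiled/linked >>> @ <TiO>-IDE, although the administrator has confirmed that both modules are installed and has audited/confirmed that they are in place ([OK]: resolved with the <TiO>-IDE site administrator and Brad; both deserve many thanks)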

/usr/bin/ld: cannot find -lblas
/usr/bin/ld: cannot find -latlas

Advice from the <TiO>-IDE administrator and Brad helped to make it work.

Process performance on a single locale (threaded version of ATLAS):

.fillRandom()         took  582125 [us] for A[,], B[,] having 2000000-real(64) elements in total.    
 <SECTION-UNDER-TEST> took 2702530 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE

The --print-commands compiler switch reports:

gcc    -I/opt/chapel//lib/chapel/1.16/third-party/qthread/install/linux64-gnu-native-flat/include
       -I/opt/chapel//lib/chapel/1.16/third-party/hwloc/install/linux64-gnu-native-flat/include
       -DCHPL_TASKS_MODEL_H=\"tasks-qthreads.h\"
       -DCHPL_THREADS_MODEL_H=\"threads-none.h\"
       -DCHPL_WIDE_POINTER_STRUCT
       -DCHPL_JEMALLOC_PREFIX=chpl_je_
       -DCHPL_HAS_GMP
       -Wno-unused
       -Wno-uninitialized
       -Wno-pointer-sign
       -Wno-tautological-compare
       -Wno-stringop-overflow
       -Wno-strict-overflow
       -c
       -o /tmp/chpl-runner-15040.deleteme/.bin.tio.tmp.o
       -I/opt/chapel//lib/chapel/1.16/third-party/qthread/install/linux64-gnu-native-flat/include
       -I.
       -I/opt/chapel//lib/chapel/1.16/runtime/include/localeModels/flat
       -I/opt/chapel//lib/chapel/1.16/runtime/include/localeModels
       -I/opt/chapel//lib/chapel/1.16/runtime/include/comm/none
       -I/opt/chapel//lib/chapel/1.16/runtime/include/comm
       -I/opt/chapel//lib/chapel/1.16/runtime/include/tasks/qthreads
       -I/opt/chapel//lib/chapel/1.16/runtime/include/threads/none
       -I/opt/chapel//lib/chapel/1.16/runtime/include
       -I/opt/chapel//lib/chapel/1.16/runtime/include/qio
       -I/opt/chapel//lib/chapel/1.16/runtime/include/atomics/intrinsics
       -I/opt/chapel//lib/chapel/1.16/runtime/include/mem/jemalloc
       -I/opt/chapel//lib/chapel/1.16/third-party/utf8-decoder
       -I/opt/chapel/share/chapel/1.16/runtime//../build/runtime/linux64/gnu/arch-native/loc-flat/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-intrinsics/gmp/hwloc/re2/wide-struct/fs-none/include
       -I/opt/chapel//lib/chapel/1.16/third-party/jemalloc/install/linux64-gnu-native/include
       -I/opt/chapel//lib/chapel/1.16/third-party/gmp/install/linux64-gnu-native/include
       -I/opt/chapel//lib/chapel/1.16/third-party/hwloc/install/linux64-gnu-native-flat/include /tmp/chpl-runner-15040.deleteme/_main.c

g++    -L/opt/chapel//lib/chapel/1.16/third-party/qthread/install/linux64-gnu-native-flat/lib
       -Wl,-rpath,/opt/chapel//lib/chapel/1.16/third-party/qthread/install/linux64-gnu-native-flat/lib
       -L/opt/chapel//lib/chapel/1.16/third-party/jemalloc/install/linux64-gnu-native/lib
       -L/opt/chapel//lib/chapel/1.16/third-party/gmp/install/linux64-gnu-native/lib
       -Wl,-rpath,/opt/chapel//lib/chapel/1.16/third-party/gmp/install/linux64-gnu-native/lib
       -L/opt/chapel//lib/chapel/1.16/third-party/hwloc/install/linux64-gnu-native-flat/lib
       -Wl,-rpath,/opt/chapel//lib/chapel/1.16/third-party/hwloc/install/linux64-gnu-native-flat/lib
       -L/opt/chapel//lib/chapel/1.16/third-party/re2/install/linux64-gnu-native/lib
       -Wl,-rpath,/opt/chapel//lib/chapel/1.16/third-party/re2/install/linux64-gnu-native/lib
       -o /tmp/chpl-runner-15040.deleteme/.bin.tio.tmp
       -L/opt/chapel//lib/chapel/1.16/runtime/lib/linux64/gnu/arch-native/loc-flat/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-intrinsics/gmp/hwloc/re2/wide-struct/fs-none
       /tmp/chpl-runner-15040.deleteme/.bin.tio.tmp.o
       /opt/chapel//lib/chapel/1.16/runtime/lib/linux64/gnu/arch-native/loc-flat/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-intrinsics/gmp/hwloc/re2/wide-struct/fs-none/main.o
       -lchpl
       -lm
       -lblas -L/usr/lib64/atlas
       -ltatlas
       -lgmp
       -lchpl
       -lqthread -L/opt/chapel//lib/chapel/1.16/third-party/hwloc/install/linux64-gnu-native-flat/lib
       -L/opt/chapel//lib/chapel/1.16/third-party/jemalloc/install/linux64-gnu-native/lib
       -ljemalloc
       -lhwloc
       -lm
       -lre2
       -lpthread

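Last but not least, let me share
a few final remarks on the performance data, on setup overheads, and on a set of boundaries that can be explored experimentally.

Although the greatest [PAR] powers were beyond the testable range (administratively unavailable, for obvious reasons, on the publicly sponsored <TiO>-IDE infrastructure) and may be investigated further on more realistic computing facilities, such as the in-house resources at Cray that are available to Cray's Chapel initiative, both the expressive power of the language and the actual state of its implementation are impressive.

Some further questions to investigate might be:

Acknowledgements

Thanks again to Dennis @ <TiO>-IDE for the support & to Brad @ Cray + team. All the best to those who keep pushing and extending this excellent software project ever further.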

writeln(// "______________________________________ChplCode.<-lsatlas> implementation___________________________________ SERIAL-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE" );
           "______________________________________ChplCode.<-ltatlas> implementation_________________________________ THREADED-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE" );
        /* 
           As the experimentally collected performance-data show and support below,
           there is about a constant,
           Matrix scale-invariant,
           additional overhead of ~ +440 ~ +500 [ms]
           for
           a THREADED-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE,
           believed to be
           associated with a setup of a thread-pool & al processing pre-arrangements,
           which
           ought be accounted for in
           an overhead-aware Amdahl Law formulation for pre-validations of a feasible choice
           whether a [PAR], using -ltatlas
           or      a [SEQ], using -lsatlas support for the [LinearAlgebra] module implementation
           will yield faster processing times.
           */

                    use Time;

var aStopWATCH_RND_GEN: Time.Timer;
var aStopWATCH_LIN_ALG: Time.Timer;
var aStopWATCH_MAT_REC: Time.Timer;
var aStopWATCH_ARR_REC: Time.Timer;

config const n_power =         5;
config const L_size  =      2600;
       const indices = 1..L_size;
       const aDomain = {indices, indices};

       var   A: [aDomain] real(64);
       var   B: [aDomain] real(64);
       const dtype =    "-real(64)";
       var   S: [aDomain] real(64);

use Random;
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.start();
    Random.fillRandom(  A );
    Random.fillRandom(  B );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_RND_GEN.stop();
writeln( ".fillRandom()        took ",                                     aStopWATCH_RND_GEN.elapsed( Time.TimeUnits.microseconds ),
         " [us] for A[,], B[,] having ", 2 * ( L_size * L_size ), dtype, " elements in total." );

     use LinearAlgebra;
var mA = LinearAlgebra.Matrix( A );
var mB = LinearAlgebra.Matrix( B );
var mS = LinearAlgebra.Matrix( S );

/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.start();
    mS = LinearAlgebra.matPlus( LinearAlgebra.matPow( mA, n_power ),
                                LinearAlgebra.matPow( mB, n_power )
                                );
/* ---------------------------------------------SECTION-UNDER-TEST--*/     aStopWATCH_LIN_ALG.stop();
/* |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| PERF--*/

writeln( ".fillRandom()        took ", aStopWATCH_RND_GEN.elapsed( Time.TimeUnits.microseconds ), " [us] for A[,], B[,] having ", 2 * ( L_size * L_size ), dtype, " elements in total." );
writeln(
       "\n<SECTION-UNDER-TEST> took ", aStopWATCH_LIN_ALG.elapsed( Time.TimeUnits.microseconds ), " [us] in [LIN_ALG] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE"
   // ,"\n<SECTION-UNDER-TEST> took ", aStopWATCH_MAT_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [MAT_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE"
   // ,"\n<SECTION-UNDER-TEST> took ", aStopWATCH_ARR_REC.elapsed( Time.TimeUnits.microseconds ), " [us] in [ARR_REC] mode ( A^n + B^n ) for [", L_size, ",", L_size, "] on <TiO>-IDE"
        );
/* ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| INF--*/

writeln(                     "<TiO>-IDE-LocaleSpace is: ", LocaleSpace, " massive. Code is executing [here], being Locale ", here.id  );
for                                                i in    LocaleSpace do
    writeln(                 "          Locale #", i, "'s ID is: ", Locales[i].id,                                            "\n                                having a name of <_",
                                                                    Locales[i].name,                                        "_>\n                                having { REAL:"              ,
                                                                 // Locales[i].numPUs( logical = false, accessible =  true ),   " | VIRT:"                       ,
                                                                    Locales[i].numPUs(           false,               true ),   " | VIRT:"                       ,
                                                                 // Locales[i].numPUs( logical =  true, accessible =  true ),   " | TEOR:"                       ,
                                                                    Locales[i].numPUs(            true,               true ),   " | TEOR:"                       ,
                                                                 // Locales[i].numPUs( logical =  true, accessible = false ),   " } PUnits"                      ,
                                                                    Locales[i].numPUs(            true,              false ),   " } PUnits\n                                having max ",
                                                                    Locales[i].maxTaskPar,                                      " 'just'-[CONCURRENT]-tasks\n                                having max ",
                                                                    Locales[i].callStackSize,                                   "-callStackSIZE."
                                                                    );

/* ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| RES:

.fillRandom()        took      560773 [us] for A[,], B[,] having  2000000-real(64) elements in total. <BEST-CASE>s IN SERIAL-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE
.fillRandom()        took     2521920 [us] for A[,], B[,] having  8000000-real(64) elements in total.
.fillRandom()        took     2717450 [us] for A[,], B[,] having  9680000-real(64) elements in total.
.fillRandom()        took     3630820 [us] for A[,], B[,] having 11520000-real(64) elements in total.

.fillRandom()        took     4429820 [us] for A[,], B[,] having 13520000-real(64) elements in total.
.fillRandom()        took     4048440 [us] for A[,], B[,] having 13520000-real(64) elements in total. ( IN THREADED-MODE ) was faster, but not systematically

.fillRandom()        took     4793110 [us] for A[,], B[,] having 15680000-real(64) elements in total. 
.fillRandom()        took     5055060 [us] for A[,], B[,] having 15680000-real(64) elements in total. ( IN THREADED-MODE )

.fillRandom()        took     5630540 [us] for A[,], B[,] having 18000000-real(64) elements in total.



<TiO>-IDE-LocaleSpace is: {0..0} massive. Code is executing [here], being Locale 0
          Locale #0's ID is: 0
                                having a name of <_tio2_>
                                having { REAL:1 | VIRT:1 | TEOR:1 } PUnits
                                having max 4 'just'-[CONCURRENT]-tasks
                                having max 8388608-callStackSIZE.

<SECTION-UNDER-TEST> took    15110000 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2000,2000] on <TiO>-IDE <BEST-CASE>s IN SERIAL-MODE
<SECTION-UNDER-TEST> took    17880300 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2200,2200] on <TiO>-IDE
<SECTION-UNDER-TEST> took    25094100 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2400,2400] on <TiO>-IDE
<SECTION-UNDER-TEST> took    31550900 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2600,2600] on <TiO>-IDE
<SECTION-UNDER-TEST> took    32996500 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2600,2600] on <TiO>-IDE
<SECTION-UNDER-TEST> took    34390400 [us] in [LIN_ALG] mode ( A^n + B^n ) for [2800,2800] on <TiO>-IDE
<SECTION-UNDER-TEST> KILL-ed                                               for [3000,3000] on <TiO>-IDE, having 18,000,000-real(64) elements .fillRandom()-ed in ~ 5.6 [s] time.

______________________________________ChplCode.<-lsatlas> implementation___________________________________ SERIAL-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE
                                                  ^________________________________________________________ SERIAL-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE
.fillRandom()        took 5.60773e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.62970e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.64366e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.70291e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.75086e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.85121e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.25645e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.77903e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.96932e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.98700e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.

<SECTION-UNDER-TEST> took 2.06538e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.07902e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.08725e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.12497e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.13071e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.22075e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.28035e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.32674e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.33844e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.35908e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE


______________________________________ChplCode.<-ltatlas> implementation_________________________________ THREADED-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE
                                                  ^______________________________________________________ THREADED-MODE [ATLAS] SUPPORT FOR [LinearAlgebra] MODULE
.fillRandom()        took 5.68652e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.73797e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.74911e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.81389e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.87079e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 5.92182e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.20989e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.62606e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.69875e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.
.fillRandom()        took 6.71270e+05 [us] for A[,], B[,] having 2000000-real(64) elements in total.

<SECTION-UNDER-TEST> took 2.53459e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.57695e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.59966e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.61859e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.70356e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.76325e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.85588e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.92058e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.92204e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE
<SECTION-UNDER-TEST> took 2.97887e+06 [us] in [LIN_ALG] mode ( A^n + B^n ) for [1000,1000] on <TiO>-IDE


<TiO>-IDE-LocaleSpace is: {0..0} massive.             +--------------------------------<_tio2_>
Code is executing [here], being Locale 0              V
          Locale #0's ID is: 0, having a name of <_tio2_>
                                having { REAL:1 | VIRT:1 | TEOR:1 } PUnits,
                                having max 4 just-[CONCURENT]-tasks,
                                having 8388608-callStackSIZE.

                                                      +--------------------------------<_tio3_>
                                                      V
...                             having a name of <_tio3_>
                                having { REAL:1 | VIRT:1 | TEOR:1 } PUnits
                                having max 4 'just'-[CONCURRENT]-tasks
                                having max 8388608-callStackSIZE.


*/
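
The overhead remark in the comment at the top of the last listing can be written down as a formula. What follows is one common overhead-aware form of Amdahl's law, offered as a sketch only; the symbols p (parallelisable fraction), N (number of processing units), T_1 (serial run time) and T_setup (thread-pool and related setup cost) are introduced here for illustration and do not appear in the original post.

```latex
% Run time on N processing units, including a one-off setup cost
T(N) = T_{\mathrm{setup}} + (1 - p)\,T_1 + \frac{p\,T_1}{N}

% Resulting speedup relative to the serial run
S(N) = \frac{T_1}{T(N)}

% The threaded build pays off only if S(N) > 1, i.e.
T_{\mathrm{setup}} < p\,T_1 \left( 1 - \frac{1}{N} \right)

% Best-case figures reported above for [1000,1000], with N = 1:
%   T_1  \approx 2.065\ \mathrm{s}   (serial,   -lsatlas)
%   T(1) \approx 2.535\ \mathrm{s}   (threaded, -ltatlas)
T_{\mathrm{setup}} \approx 2.535 - 2.065 \approx 0.47\ \mathrm{s},
\qquad
S(1) \approx \frac{2.065}{2.535} \approx 0.81
```

With the single processing unit reported for the tio2 / tio3 locales (N = 1), the right-hand side of the inequality is zero, so under this model any setup cost at all makes the threaded build slower. That is consistent with the roughly constant extra time measured above, and whether the balance tips the other way on a machine with more processing units is left open here.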