Matrix dot product using LAPACK-BLAS DGEMM

Time: 2016-09-02 14:53:16

Tags: scala matrix-multiplication lapack blas

Using the lapack-blas dgemm function, we are trying to compute the dot product of these matrices:

    val A = Array(Array(0.7266678772119796, 0.37866742996700287, 0.011693659632231124),
                  Array(0.09987886438245919, 0.3676551935579567, 0.6323601372667774))

    val B = Array(Array(0.1539391703466485, 0.8259866297685163, 0.14377752901280771, 0.7412313835216213),
                  Array(0.1415314251516353, 0.6226998769259113, 0.22445999933643912, 0.2190218035735153),
                  Array(0.8696518309547832, 0.6548401943199273, 0.7637877932908158, 0.14197100882023972))

We called it with the following parameters:

    val A_ = A.flatten
    val B_ = B.flatten
    val m = A.size
    val k1 = A(0).size
    val k2 = B.size
    val n = B(0).size
    require(k1 == k2, "number of columns in A must match number of rows in B")
    var C = Array.fill[Double](m*n)(0.0)
    blas.dgemm("N", "N", m, n, k2, 1.0, A_ , m, B_ , k2, 1.0, C, m)

This gives us the wrong result. We expect the dot product to be:

    Array(Array(0.17562540366704119, 0.8436715912415502, 0.19840567736364106, 0.6232256201072643),
          Array(0.6173431842237301, 0.7255322855240385, 0.5798731746316419, 0.2443346590424818))

But it gives us the following values:

    Array(Array(0.34876402380669536, 1.5384458001585097, 0.9708020997951017, 1.0739583742659222),
          Array(0.4634190691304188, 1.3771735213529386, 1.3136089825838326, 0.8280594349415209))

Any idea why the values are wrong?

1 answer:

Answer 0: (score: 0)

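1) flatten appears to walk the elements row by row, but Fortran expects them column by column (column-major order), so with lda = m and ldb = k2 the matrices that actually get multiplied are these: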

    A2 = Array(Array(0.7266678772119796, 0.011693659632231124, 0.3676551935579567),
               Array(0.37866742996700287, 0.09987886438245919, 0.6323601372667774))

    B2 = Array(Array(0.1539391703466485, 0.7412313835216213, 0.22445999933643912, 0.6548401943199273),
               Array(0.8259866297685163, 0.1415314251516353, 0.2190218035735153, 0.7637877932908158),
               Array(0.14377752901280771, 0.6226998769259113, 0.8696518309547832, 0.14197100882023972))

2) That produces this wrong result:

    Array(Array(0.1743820119033477, 0.7692229000792548, 0.4854010498975508, 0.5369791871329611),
          Array(0.2317095345652094, 0.6885867606764693, 0.6568044912919163, 0.4140297174707604))

3) This wrong result is then somehow doubled (is the value of m leaking into alpha?), which yields exactly the values you are observing.
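One possible way around the layout mismatch, sketched below, is to leave the row-major arrays as they are and swap the operands: a row-major matrix occupies the same memory as its column-major transpose, and (A * B)^T = B^T * A^T, so asking for "B times A" with m and n exchanged should leave C in row-major order. To keep the sketch runnable without a BLAS dependency, `dgemmNN` is a hypothetical pure-Scala stand-in for DGEMM with both transpose flags set to "N"; it is not the library routine, and the assumption is that the same argument order carries over to `blas.dgemm`. The sketch also passes beta = 0.0 so that C is overwritten rather than accumulated into. With beta = 1.0, repeated calls on the same C would add up, though nothing in the question confirms that this is what caused the doubling.

```scala
object DgemmLayoutSketch {
  // Hypothetical stand-in for DGEMM with transa = transb = "N":
  // C := alpha * A * B + beta * C, every operand stored column-major.
  def dgemmNN(m: Int, n: Int, k: Int, alpha: Double,
              a: Array[Double], lda: Int,
              b: Array[Double], ldb: Int,
              beta: Double, c: Array[Double], ldc: Int): Unit = {
    for (j <- 0 until n; i <- 0 until m) {
      var sum = 0.0
      for (p <- 0 until k) sum += a(i + p * lda) * b(p + j * ldb)
      c(i + j * ldc) = alpha * sum + beta * c(i + j * ldc)
    }
  }

  val A = Array(
    Array(0.7266678772119796, 0.37866742996700287, 0.011693659632231124),
    Array(0.09987886438245919, 0.3676551935579567, 0.6323601372667774))

  val B = Array(
    Array(0.1539391703466485, 0.8259866297685163, 0.14377752901280771, 0.7412313835216213),
    Array(0.1415314251516353, 0.6226998769259113, 0.22445999933643912, 0.2190218035735153),
    Array(0.8696518309547832, 0.6548401943199273, 0.7637877932908158, 0.14197100882023972))

  // Multiplies row-major matrices through the column-major routine.
  // The flattened B is read as column-major B^T (n x k, leading dimension n),
  // the flattened A as column-major A^T (k x m, leading dimension k),
  // and the output is column-major C^T (n x m), i.e. row-major C (m x n).
  def multiplyRowMajor(a: Array[Array[Double]],
                       b: Array[Array[Double]]): Array[Array[Double]] = {
    val m = a.length
    val k = a(0).length
    val n = b(0).length
    require(k == b.length, "number of columns in A must match number of rows in B")
    val c = new Array[Double](m * n)
    dgemmNN(n, m, k, 1.0, b.flatten, n, a.flatten, k, 0.0, c, n)
    c.grouped(n).toArray // m rows of n entries each
  }

  def main(args: Array[String]): Unit =
    multiplyRowMajor(A, B).foreach(row => println(row.mkString(", ")))
}
```

An alternative with the same effect would be to flatten the transposes (`A.transpose.flatten`, `B.transpose.flatten`) and keep the original argument order, in which case C comes back column-major and has to be read with stride m.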