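I'm trying out Apple's Metal matrix multiplication sample here: https://developer.apple.com/library/ios/samplecode/MetalPartialSumsCompute/Introduction/Intro.html
I'm getting strange results: for tests [1] through [7], Metal runs at around 0.05 GFlops. From test [8] through [20], Metal suddenly becomes very fast, at roughly 500 GFlops. The log is attached below. I looked through the code and nothing differs between the tests; they all use random matrices of similar size. Metal seems to start running fast at some point for no apparent reason. Any idea what is happening?
Log: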
2016-06-30 16:13:29.609 MetalMatrixMultiplication-iOS[3459:742844] >> [1] Matrix Dimensions: A = [841 x 2012], B = [2012 x 554], C = [841 x 554], lda = 848, ldb = 560, ldc = 560
>> [1] Accelerate 6.934929 gflops/sec, Metal 0.044756 gflops/sec, Accelerate 27.034708 millisecs, Metal 4189.027417 millisecs, Diff 1.369554e-01
2016-06-30 16:13:31.747 MetalMatrixMultiplication-iOS[3459:742844] >> [2] Matrix Dimensions: A = [721 x 432], B = [432 x 1436], C = [721 x 1436], lda = 728, ldb = 1440, ldc = 1440
>> [2] Accelerate 1.405928 gflops/sec, Metal 0.045415 gflops/sec, Accelerate 63.626833 millisecs, Metal 1969.722500 millisecs, Diff 4.248900e-02
2016-06-30 16:13:34.820 MetalMatrixMultiplication-iOS[3459:742844] >> [3] Matrix Dimensions: A = [1362 x 457], B = [457 x 1078], C = [1362 x 1078], lda = 1368, ldb = 1080, ldc = 1080
>> [3] Accelerate 1.754547 gflops/sec, Metal 0.046793 gflops/sec, Accelerate 76.485125 millisecs, Metal 2867.863083 millisecs, Diff 3.673622e-02
2016-06-30 16:13:45.549 MetalMatrixMultiplication-iOS[3459:742844] >> [4] Matrix Dimensions: A = [1783 x 1901], B = [1901 x 1347], C = [1783 x 1347], lda = 1784, ldb = 1352, ldc = 1352
>> [4] Accelerate 6.528442 gflops/sec, Metal 0.091166 gflops/sec, Accelerate 139.869000 millisecs, Metal 10016.091333 millisecs, Diff 5.854867e-02
2016-06-30 16:13:48.912 MetalMatrixMultiplication-iOS[3459:742844] >> [5] Matrix Dimensions: A = [709 x 600], B = [600 x 1683], C = [709 x 1683], lda = 712, ldb = 1688, ldc = 1688
>> [5] Accelerate 2.629253 gflops/sec, Metal 0.045250 gflops/sec, Accelerate 54.460208 millisecs, Metal 3164.426333 millisecs, Diff 4.654048e-02
2016-06-30 16:13:57.534 MetalMatrixMultiplication-iOS[3459:742844] >> [6] Matrix Dimensions: A = [636 x 1573], B = [1573 x 1942], C = [636 x 1942], lda = 640, ldb = 1944, ldc = 1944
>> [6] Accelerate 7.106906 gflops/sec, Metal 0.047387 gflops/sec, Accelerate 54.674458 millisecs, Metal 8199.887292 millisecs, Diff 7.446345e-02
2016-06-30 16:14:10.669 MetalMatrixMultiplication-iOS[3459:742844] >> [7] Matrix Dimensions: A = [1803 x 1689], B = [1689 x 1950], C = [1803 x 1950], lda = 1808, ldb = 1952, ldc = 1952
>> [7] Accelerate 6.759199 gflops/sec, Metal 0.096267 gflops/sec, Accelerate 175.709292 millisecs, Metal 12337.145375 millisecs, Diff 4.568898e-02
2016-06-30 16:14:10.878 MetalMatrixMultiplication-iOS[3459:742844] >> [8] Matrix Dimensions: A = [416 x 749], B = [749 x 2034], C = [416 x 2034], lda = 416, ldb = 2040, ldc = 2040
>> [8] Accelerate 3.589321 gflops/sec, Metal 220.343105 gflops/sec, Accelerate 35.313750 millisecs, Metal 0.575250 millisecs, Diff 0.000000e+00
2016-06-30 16:14:11.003 MetalMatrixMultiplication-iOS[3459:742844] >> [9] Matrix Dimensions: A = [657 x 716], B = [716 x 734], C = [657 x 734], lda = 664, ldb = 736, ldc = 736
>> [9] Accelerate 2.946337 gflops/sec, Metal 102.394388 gflops/sec, Accelerate 23.438083 millisecs, Metal 0.674417 millisecs, Diff 0.000000e+00
2016-06-30 16:14:11.124 MetalMatrixMultiplication-iOS[3459:742844] >> [10] Matrix Dimensions: A = [446 x 945], B = [945 x 707], C = [446 x 707], lda = 448, ldb = 712, ldc = 712
>> [10] Accelerate 3.426099 gflops/sec, Metal 94.259957 gflops/sec, Accelerate 17.394667 millisecs, Metal 0.632250 millisecs, Diff 0.000000e+00
2016-06-30 16:14:11.533 MetalMatrixMultiplication-iOS[3459:742844] >> [11] Matrix Dimensions: A = [935 x 1286], B = [1286 x 1899], C = [935 x 1899], lda = 936, ldb = 1904, ldc = 1904
>> [11] Accelerate 6.185983 gflops/sec, Metal 441.997324 gflops/sec, Accelerate 73.824208 millisecs, Metal 1.033208 millisecs, Diff 0.000000e+00
2016-06-30 16:14:11.685 MetalMatrixMultiplication-iOS[3459:742844] >> [12] Matrix Dimensions: A = [541 x 956], B = [956 x 960], C = [541 x 960], lda = 544, ldb = 960, ldc = 960
>> [12] Accelerate 3.805037 gflops/sec, Metal 153.253113 gflops/sec, Accelerate 26.097417 millisecs, Metal 0.647958 millisecs, Diff 0.000000e+00
2016-06-30 16:14:12.007 MetalMatrixMultiplication-iOS[3459:742844] >> [13] Matrix Dimensions: A = [1278 x 1809], B = [1809 x 500], C = [1278 x 500], lda = 1280, ldb = 504, ldc = 504
>> [13] Accelerate 7.661287 gflops/sec, Metal 343.033372 gflops/sec, Accelerate 30.176417 millisecs, Metal 0.673958 millisecs, Diff 0.000000e+00
2016-06-30 16:14:12.456 MetalMatrixMultiplication-iOS[3459:742844] >> [14] Matrix Dimensions: A = [1933 x 1534], B = [1534 x 805], C = [1933 x 805], lda = 1936, ldb = 808, ldc = 808
>> [14] Accelerate 7.221810 gflops/sec, Metal 696.681127 gflops/sec, Accelerate 66.105417 millisecs, Metal 0.685250 millisecs, Diff 0.000000e+00
2016-06-30 16:14:12.552 MetalMatrixMultiplication-iOS[3459:742844] >> [15] Matrix Dimensions: A = [291 x 645], B = [645 x 1034], C = [291 x 1034], lda = 296, ldb = 1040, ldc = 1040
>> [15] Accelerate 2.155479 gflops/sec, Metal 62.162540 gflops/sec, Accelerate 18.007750 millisecs, Metal 0.624417 millisecs, Diff 0.000000e+00
2016-06-30 16:14:12.940 MetalMatrixMultiplication-iOS[3459:742844] >> [16] Matrix Dimensions: A = [1656 x 1547], B = [1547 x 781], C = [1656 x 781], lda = 1656, ldb = 784, ldc = 784
>> [16] Accelerate 7.341706 gflops/sec, Metal 424.495925 gflops/sec, Accelerate 54.504792 millisecs, Metal 0.942667 millisecs, Diff 0.000000e+00
2016-06-30 16:14:13.425 MetalMatrixMultiplication-iOS[3459:742844] >> [17] Matrix Dimensions: A = [1651 x 1320], B = [1320 x 1429], C = [1651 x 1429], lda = 1656, ldb = 1432, ldc = 1432
>> [17] Accelerate 6.615108 gflops/sec, Metal 1001.902932 gflops/sec, Accelerate 94.155625 millisecs, Metal 0.621667 millisecs, Diff 0.000000e+00
2016-06-30 16:14:13.757 MetalMatrixMultiplication-iOS[3459:742844] >> [18] Matrix Dimensions: A = [2037 x 384], B = [384 x 1615], C = [2037 x 1615], lda = 2040, ldb = 1616, ldc = 1616
>> [18] Accelerate 1.737157 gflops/sec, Metal 331.366545 gflops/sec, Accelerate 145.440583 millisecs, Metal 0.762458 millisecs, Diff 0.000000e+00
2016-06-30 16:14:13.923 MetalMatrixMultiplication-iOS[3459:742844] >> [19] Matrix Dimensions: A = [795 x 677], B = [677 x 1145], C = [795 x 1145], lda = 800, ldb = 1152, ldc = 1152
>> [19] Accelerate 3.405232 gflops/sec, Metal 192.017503 gflops/sec, Accelerate 36.194667 millisecs, Metal 0.641875 millisecs, Diff 0.000000e+00
2016-06-30 16:14:14.033 MetalMatrixMultiplication-iOS[3459:742844] >> [20] Matrix Dimensions: A = [1062 x 438], B = [438 x 678], C = [1062 x 678], lda = 1064, ldb = 680, ldc = 680
>> [20] Accelerate 2.090133 gflops/sec, Metal 98.388385 gflops/sec, Accelerate 30.177583 millisecs, Metal 0.641083 millisecs, Diff 0.000000e+00
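Answer 0 (score: 0)
What is happening is that the operation is failing, but the demo code does not check the command buffer's status, so it only looks as if it is running faster.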
If you add this block
    if (m_CmdBuffer.status == MTLCommandBufferStatusError) {
        NSLog(@"Error occurred when executing command buffer");
        NSLog(@"Error code: %@", m_CmdBuffer.error);
    }
at the end of MetalMatrixMult's finalize method (MetalMatrixMult.mm, line 513), you will see when the errors occur.
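The first failure reported is:
Error code: Error Domain=MTLCommandBufferErrorDomain Code=2 "Caused GPU Timeout Error (IOAF code 2)" UserInfo={NSLocalizedDescription=Caused GPU Timeout Error (IOAF code 2)}
Then, after a few of those have been reported:
Error code: Error Domain=MTLCommandBufferErrorDomain Code=4 "Ignored (for causing prior/excessive GPU errors) (IOAF code 4)" UserInfo={NSLocalizedDescription=Ignored (for causing prior/excessive GPU errors) (IOAF code 4)}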
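Another thing I've noticed about Metal on iOS 9 is that there seems to be a memory management bug when GPU Frame Capture and Metal API Validation are turned on (Edit Scheme -> Options tab). It is as if Metal buffers are not released when running in this mode.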
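
As a complement to checking the command buffer status, the timings in the log can be sanity-checked on their own. The following is a minimal sketch in plain C, assuming the usual 2·M·N·K operation count for a matrix product; the function names and the idea of a caller-supplied ceiling are hypothetical and not part of Apple's sample. Because the figures are derived from that count, they need not match the sample's own gflops column.

```c
/* Floating-point operations for C = A * B, where A is m x k and B is k x n:
   one multiply and one add per inner-product term (assumed 2*m*n*k count). */
double gemm_flops(double m, double n, double k) {
    return 2.0 * m * n * k;
}

/* Throughput in GFLOP/s implied by an elapsed time given in milliseconds. */
double implied_gflops(double m, double n, double k, double millisecs) {
    return gemm_flops(m, n, k) / (millisecs * 1.0e-3) / 1.0e9;
}

/* Hypothetical helper: returns 1 when the implied throughput exceeds a
   ceiling chosen by the caller, i.e. the timing looks too good to be true. */
int timing_is_implausible(double m, double n, double k,
                          double millisecs, double ceiling_gflops) {
    return implied_gflops(m, n, k, millisecs) > ceiling_gflops;
}
```

Feeding in the Metal timing from log entry [17] (C = [1651 x 1429], inner dimension 1320, 0.621667 ms) gives an implied rate on the order of 10,000 GFLOP/s, while entry [7] (C = [1803 x 1950], inner dimension 1689, 12337.145375 ms) comes out at about 1 GFLOP/s. A ceiling of, say, 1000 GFLOP/s (an arbitrary choice for illustration) flags the former and not the latter, which is consistent with the command buffer having failed rather than the GPU having sped up.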