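In WWDC session 510, Apple engineers introduced support for writing CIKernel code in Metal and claimed that it should run faster.
I implemented a motion blur in both Metal and GLSL in a test project (the code is similar to the code from session 510).
Sometimes the Metal kernel is faster and sometimes the GLSL kernel is faster, but I definitely do not see the Metal kernel performing consistently and significantly better. Is that how it is supposed to be, or am I missing something?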
Note: the project does not run in the simulator; you need a device with an A8 or later chip.
Answer 0 (score: 0)
It looks like some of this is hardware-dependent. Here are my iPad Pro 10.5-inch results:
My iPhone SE results:
glsl 1 took 229.572057723999ms
glsl 2 took 49.1310358047485ms
glsl 3 took 46.7269420623779ms
glsl 4 took 53.08997631073ms
glsl 5 took 48.9979982376099ms
glsl 6 took 49.0390062332153ms
glsl 7 took 52.5139570236206ms
glsl 8 took 46.4930534362793ms
glsl 9 took 39.6310091018677ms
glsl 10 took 45.9860563278198ms
metal 1 took 77.7549743652344ms
metal 2 took 44.1800355911255ms
metal 3 took 46.0859537124634ms
metal 4 took 45.3709363937378ms
metal 5 took 43.5279607772827ms
metal 6 took 38.9848947525024ms
metal 7 took 37.1809005737305ms
metal 8 took 37.8340482711792ms
metal 9 took 37.6850366592407ms
metal 10 took 37.5720262527466ms
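To compare the two kernels on the iPhone SE numbers above, it may help to average the runs after dropping the first one. This is only a rough sketch: treating run 1 as a warm-up (one-time costs such as kernel compilation) is an assumption, not something measured here, and the helper name `steady_state_mean` is made up for illustration. The timings are copied from the log above and rounded to two decimals.

```python
from statistics import mean

# Timings in milliseconds, copied from the iPhone SE log above (rounded).
glsl_ms = [229.57, 49.13, 46.73, 53.09, 49.00, 49.04, 52.51, 46.49, 39.63, 45.99]
metal_ms = [77.75, 44.18, 46.09, 45.37, 43.53, 38.98, 37.18, 37.83, 37.69, 37.57]


def steady_state_mean(samples, warmup=1):
    """Average the samples after discarding the first `warmup` runs.

    Assumption: the first run includes one-time setup cost and is not
    representative of steady-state rendering time.
    """
    return mean(samples[warmup:])


glsl_avg = steady_state_mean(glsl_ms)
metal_avg = steady_state_mean(metal_ms)

print(f"glsl  steady-state mean: {glsl_avg:.1f} ms")
print(f"metal steady-state mean: {metal_avg:.1f} ms")
print(f"metal / glsl ratio:      {metal_avg / glsl_avg:.2f}")
```

Ten samples per kernel on one device is a small basis, so any difference this shows is an indication and not a firm conclusion.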
One question and a thought: