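I'm trying to run a simple depthwise convolution kernel with Metal Performance Shaders on macOS, and I've run into a problem. First, I initialize an MPSImage (called debugInputImage) of the appropriate size and fill it with a constant value, e.g. 1.0. Then I create the convolution kernel: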
convolution_depthwise_0 = MPSCNNConvolution(device: device,
                                            weights: datasource_depthwise_0)
datasource_depthwise_0 is an instance of MPSCNNConvolutionDataSource whose descriptor() method returns the following descriptor:
func descriptor() -> MPSCNNConvolutionDescriptor {
    // Depthwise: one 3x3 filter per channel, 32 channels in and out.
    let desc = MPSCNNDepthWiseConvolutionDescriptor(kernelWidth: 3,
                                                    kernelHeight: 3,
                                                    inputFeatureChannels: 32,
                                                    outputFeatureChannels: 32)
    return desc
}
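For context, a depthwise descriptor implies a much smaller weight array than a regular convolution of the same shape. The sketch below is only an illustration of that arithmetic, assuming the layout described in the MPS headers (one kernelHeight x kernelWidth filter per output channel, with a channel multiplier of 1). The two function names are hypothetical helpers, not part of the MPS API.

```swift
// Hypothetical helpers (not MPS API): number of weights the data source
// is expected to provide for each kind of descriptor.

// Regular convolution: every output channel mixes all input channels.
func regularWeightCount(kernelWidth: Int, kernelHeight: Int,
                        inputChannels: Int, outputChannels: Int) -> Int {
    return outputChannels * kernelHeight * kernelWidth * inputChannels
}

// Depthwise convolution (assumed channel multiplier of 1):
// one filter per output channel, no cross-channel mixing.
func depthwiseWeightCount(kernelWidth: Int, kernelHeight: Int,
                          outputChannels: Int) -> Int {
    return outputChannels * kernelHeight * kernelWidth
}

let regular = regularWeightCount(kernelWidth: 3, kernelHeight: 3,
                                 inputChannels: 32, outputChannels: 32)
let depthwise = depthwiseWeightCount(kernelWidth: 3, kernelHeight: 3,
                                     outputChannels: 32)
print(regular, depthwise)
```

If the data source was first written for a regular convolution, its weight buffer size is worth checking against the depthwise count, although a mismatch there would not obviously explain the assertion reported here.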
This is how I initialize the input image:
let imageDescriptor = MPSImageDescriptor(channelFormat: .float16,
                                         width: 256, height: 256, featureChannels: 32)
debugInputImage = MPSImage(device: device,
                           imageDescriptor: imageDescriptor)
var arrayOfOnes = Array(repeating: Float(1.0),
                        count: imageDescriptor.width * imageDescriptor.height
                               * imageDescriptor.featureChannels)
let arrayOfOnes16 = toFloat16(&arrayOfOnes, size: arrayOfOnes.count)
debugInputImage.writeBytes(arrayOfOnes16,
                           dataLayout: MPSDataLayout.HeightxWidthxFeatureChannels, imageIndex: 0)
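The toFloat16 helper is not shown in the post. Below is a minimal sketch of one possible implementation, assuming it takes the array as an inout parameter and returns raw IEEE 754 half-precision bit patterns as [UInt16]. The real helper may differ (for example, it could wrap a conversion routine from Accelerate), and floatToHalfBits is a hypothetical name introduced only for this illustration.

```swift
// Hypothetical helper: converts one Float to its IEEE 754 half-precision
// bit pattern, rounding to nearest (ties to even).
func floatToHalfBits(_ value: Float) -> UInt16 {
    let bits = value.bitPattern
    let sign = UInt16((bits >> 16) & 0x8000)
    let exponent = Int((bits >> 23) & 0xFF)
    let mantissa = bits & 0x007F_FFFF

    if exponent == 0xFF {
        // Infinity or NaN: keep NaN-ness by setting a mantissa bit.
        let nanBit: UInt16 = mantissa != 0 ? 0x0200 : 0
        return sign | 0x7C00 | nanBit
    }
    let halfExponent = exponent - 127 + 15   // rebias 8-bit to 5-bit exponent
    if halfExponent >= 0x1F {
        return sign | 0x7C00                 // too large: overflow to infinity
    }
    if halfExponent <= 0 {
        if halfExponent < -10 { return sign } // too small: signed zero
        // Subnormal half: shift the mantissa (with implicit 1) into place.
        let full = mantissa | 0x0080_0000
        let shift = UInt32(14 - halfExponent)
        var half = UInt16(full >> shift)
        let remainder = full & ((UInt32(1) << shift) - 1)
        let halfway = UInt32(1) << (shift - 1)
        if remainder > halfway || (remainder == halfway && (half & 1) == 1) {
            half += 1
        }
        return sign | half
    }
    // Normal half: drop the low 13 mantissa bits, then round.
    var half = UInt16(halfExponent << 10) | UInt16(mantissa >> 13)
    let remainder = mantissa & 0x1FFF
    if remainder > 0x1000 || (remainder == 0x1000 && (half & 1) == 1) {
        half += 1
    }
    return sign | half
}

// Assumed signature, matching the call site toFloat16(&array, size: n).
func toFloat16(_ input: inout [Float], size: Int) -> [UInt16] {
    return input.prefix(size).map(floatToHalfBits)
}

var ones = [Float](repeating: 1.0, count: 4)
let ones16 = toFloat16(&ones, size: ones.count)
print(ones16)
```

With a helper along these lines, the buffer passed to writeBytes holds two bytes per element, which matches the .float16 channel format of the image descriptor.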
When I then run all of this:
let commandBuffer = commandQueue.makeCommandBuffer()!
let outImage = convolution_depthwise_0.encode(commandBuffer: commandBuffer,
                                              sourceImage: debugInputImage)
I get this error (on the line let outImage = convolution_depthwise_0.encode(...):
validateComputeFunctionArguments:860: failed assertion `Compute
Function(depthwiseConvolution): missing threadgroupMemory binding
at index 0 for lM[0].'
With a regular convolution everything works fine; I only hit this problem with the depthwise variant.
What could be causing this error?
System: macOS 10.14, Xcode 10.1 beta 3
Only MPSCNNDepthWiseConvolutionDescriptor fails; MPSCNNConvolutionDescriptor works without problems. I also have no problems on iOS, only on macOS.