Compute Function(depthwiseConvolution): missing threadgroupMemory binding at index 0 for lM[0]

Asked: 2018-11-07 16:55:30

Tags: metal metal-performance-shaders

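I am trying to run a simple depthwise convolution kernel with Metal Performance Shaders on macOS and have run into a problem. First, I initialize an MPSImage (called debugInputImage) of the appropriate size and fill it with a constant value, e.g. 1.0. Then I create the convolution kernel: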

convolution_depthwise_0 = MPSCNNConvolution(device: device, 
    weights: datasource_depthwise_0)

datasource_depthwise_0 is an instance of MPSCNNConvolutionDataSource with the following descriptor:

func descriptor() -> MPSCNNConvolutionDescriptor {
    let desc = MPSCNNDepthWiseConvolutionDescriptor(kernelWidth: 3,
        kernelHeight: 3,
        inputFeatureChannels: 32,
        outputFeatureChannels: 32)
    return desc
}

This is how I initialize the input image:

let imageDescriptor = MPSImageDescriptor(channelFormat: .float16, 
    width: 256, height: 256, featureChannels: 32)

debugInputImage = MPSImage(device: device, 
    imageDescriptor: imageDescriptor)

var arrayOfOnes = Array(repeating: Float(1.0), 
    count: imageDescriptor.width * imageDescriptor.height 
        * imageDescriptor.featureChannels)


let arrayOfOnes16 = toFloat16(&arrayOfOnes, size: arrayOfOnes.count)

debugInputImage.writeBytes(arrayOfOnes16, 
    dataLayout: MPSDataLayout.HeightxWidthxFeatureChannels, imageIndex: 0)
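The question does not show how toFloat16 is defined. The following is a minimal sketch of what such a helper might look like, written in plain Swift so that it does not depend on any framework. The names halfBits and the exact signature are assumptions chosen to match the call site above. The sketch truncates the mantissa instead of rounding to nearest, and flushes values too small for a normal half-precision number to zero, which is sufficient for constant test inputs such as 1.0.

```swift
/// Converts one Float to IEEE 754 half-precision bits.
/// Sketch only: truncates the mantissa and flushes subnormal results to zero.
func halfBits(_ value: Float) -> UInt16 {
    let bits = value.bitPattern
    let sign = UInt16((bits >> 16) & 0x8000)
    let exponent = Int((bits >> 23) & 0xFF)
    let mantissa = bits & 0x007F_FFFF

    if exponent == 0xFF {
        // Infinity keeps a zero mantissa; NaN gets a non-zero mantissa bit.
        return sign | 0x7C00 | (mantissa != 0 ? 0x0200 : 0)
    }
    // Re-bias the exponent from single precision (127) to half precision (15).
    let halfExponent = exponent - 127 + 15
    if halfExponent >= 0x1F {
        return sign | 0x7C00 // too large for half precision: infinity
    }
    if halfExponent <= 0 {
        return sign // too small for a normal half: signed zero
    }
    return sign | UInt16(halfExponent << 10) | UInt16(mantissa >> 13)
}

/// Hypothetical stand-in for the toFloat16 helper used in the question.
/// Returns the half-precision bit patterns for `size` floats read from `input`.
func toFloat16(_ input: UnsafePointer<Float>, size: Int) -> [UInt16] {
    return (0..<size).map { halfBits(input[$0]) }
}
```

The returned [UInt16] array can be passed directly to writeBytes as in the snippet above, because Swift bridges an array argument to the raw pointer parameter. On Apple platforms, Accelerate's vImageConvert_PlanarFtoPlanar16F is a possible alternative for bulk conversion.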

I then run all of this:

let commandBuffer = commandQueue.makeCommandBuffer()!
let outImage = convolution_depthwise_0.encode(commandBuffer: commandBuffer, 
    sourceImage: debugInputImage)

and get this error (at the line let outImage = convolution_depthwise_0.encode(...)):

validateComputeFunctionArguments:860: failed assertion `Compute 
Function(depthwiseConvolution): missing threadgroupMemory binding 
at index 0 for lM[0].'

Everything works fine with a regular convolution; the problem occurs only with the depthwise variant.

What could be causing this error?

System: macOS 10.14, Xcode 10.1 beta 3

Only MPSCNNDepthWiseConvolutionDescriptor fails; MPSCNNConvolutionDescriptor works without issue. I also have no problem on iOS, only on macOS.
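Since the kernel runs on iOS but not on macOS, a CPU baseline may help to compare results across platforms. Below is a rough sketch of what a depthwise convolution computes, under several assumptions that are not taken from the question: channel multiplier 1, stride 1, zero padding so the output keeps the input size, input laid out as height x width x channels, and weights laid out as kernelHeight x kernelWidth x channels. The function name and the weight layout are hypothetical and do not necessarily match what MPS expects from a data source.

```swift
/// CPU reference for a depthwise convolution: every channel is filtered
/// independently with its own kernelHeight x kernelWidth kernel.
/// Assumes stride 1, zero padding, and the layouts described above.
func depthwiseConvolutionReference(input: [Float], width: Int, height: Int,
                                   channels: Int, weights: [Float],
                                   kernelWidth: Int, kernelHeight: Int) -> [Float] {
    precondition(input.count == width * height * channels)
    precondition(weights.count == kernelWidth * kernelHeight * channels)

    var output = [Float](repeating: 0, count: input.count)
    let padY = kernelHeight / 2
    let padX = kernelWidth / 2

    for y in 0..<height {
        for x in 0..<width {
            for c in 0..<channels {
                var sum: Float = 0
                for ky in 0..<kernelHeight {
                    let sy = y + ky - padY
                    if sy < 0 || sy >= height { continue } // zero padding
                    for kx in 0..<kernelWidth {
                        let sx = x + kx - padX
                        if sx < 0 || sx >= width { continue } // zero padding
                        let pixel = input[(sy * width + sx) * channels + c]
                        let weight = weights[(ky * kernelWidth + kx) * channels + c]
                        sum += pixel * weight
                    }
                }
                output[(y * width + x) * channels + c] = sum
            }
        }
    }
    return output
}
```

With an input of all ones and 3x3 weights of all ones, this sketch yields 9 for interior pixels, 6 along the edges, and 4 in the corners, which gives a quick sanity check for the all-ones debugInputImage described above.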

0 answers:

No answers