Metal kernel shader not working properly

Date: 2017-07-14 23:16:05

Tags: metal

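I'm confused as to why my kernel shader doesn't work.

I have a raw RGBA32 pixel buffer (inBuffer) that I send to the kernel shader. I also have a receiving MTLTexture, created from an RGBA8Unorm descriptor whose usage I set to MTLTextureUsageRenderTarget.

I then set up and encode the dispatch like this: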

id<MTLLibrary> library = [_device newDefaultLibrary];
id<MTLFunction> kernelFunction = [library newFunctionWithName:@"stripe_Kernel"];
id<MTLComputePipelineState> pipeline = [_device newComputePipelineStateWithFunction:kernelFunction error:&error];
id<MTLCommandQueue> commandQueue = [_device newCommandQueue];
MTLTextureDescriptor *textureDescription = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                                                                              width:outputSize.width
                                                                                             height:outputSize.height
                                                                                          mipmapped:NO];
[textureDescription setUsage:MTLTextureUsageRenderTarget];
id<MTLTexture> metalTexture = [_device newTextureWithDescriptor:textureDescription];

MTLSize threadgroupCounts = MTLSizeMake(8, 8, 1);
MTLSize threadgroups = MTLSizeMake([metalTexture width] / threadgroupCounts.width,
                                   [metalTexture height] / threadgroupCounts.height, 1);

...

id<MTLBuffer> metalBuffer = [_device newBufferWithBytesNoCopy:inBuffer
                                                       length:inputByteCount
                                                       options:MTLResourceStorageModeShared
                                                      deallocator:nil];

    [commandEncoder setComputePipelineState:pipeline];
    [commandEncoder setTexture:metalTexture atIndex:0];
    [commandEncoder setBuffer:metalBuffer offset:0 atIndex:0];
    [commandEncoder setBytes:&imageW length:sizeof(ushort) atIndex:1];
    [commandEncoder setBytes:&imageH length:sizeof(ushort) atIndex:2];

    [commandEncoder dispatchThreadgroups:threadgroups threadsPerThreadgroup:threadgroupCounts];
    [commandEncoder endEncoding];

    [commandBuffer commit];
    [commandBuffer waitUntilCompleted];
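
A side note on the threadgroup count computed above: `[metalTexture width] / threadgroupCounts.width` is integer division, so a texture size that is not a multiple of 8 would leave the last partial row or column of threadgroups undispatched. The 2048x896 example divides evenly, so this is unlikely to be the cause of the empty output here. A minimal C sketch of the usual round-up follows; `threadgroups_for` is a hypothetical helper name, not a Metal API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Metal): number of threadgroups needed
   to cover `extent` pixels with `group` threads per threadgroup, rounding
   up so that a partial group at the edge is still dispatched. */
size_t threadgroups_for(size_t extent, size_t group)
{
    assert(group > 0);
    return (extent + group - 1) / group;
}
```

With a rounded-up count, some threads can land outside the texture, so the kernel would need to return early when `gid.x` or `gid.y` is out of range.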

The goal is to take a raw image of size m x n and pack it into a texture of, for example, 2048x896. Here is my kernel shader:

kernel void stripe_Kernel(texture2d<float, access::write> outTexture [[ texture(0) ]],
                      device const float *inBuffer [[ buffer(0) ]],
                      device const ushort * imageWidth [[ buffer(1) ]],
                      device const ushort * imageHeight [[ buffer(2) ]],
                      uint2 gid [[ thread_position_in_grid ]])
{
    const ushort imageW = *imageWidth;
    const ushort imageH = *imageHeight;

    const uint32_t textureW = outTexture.get_width();  // eg. 2048

    uint32_t posX = gid.x;  // eg. 0...2047
    uint32_t posY = gid.y;  // eg. 0...895

    uint32_t sourceX = ((int)(posY/imageH)*textureW + posX) % imageW;
    uint32_t sourceY = (int)(posY% imageH);

    const uint32_t ptr = (sourceX + sourceY* imageW);
    float pixel = inBuffer[ptr];

    outTexture.write(pixel, gid);
}
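
To sanity-check the index math independently of the GPU, the same mapping can be mirrored on the CPU. This is only a sketch: the function and struct names are hypothetical, and it assumes the kernel's integer arithmetic behaves like plain unsigned C arithmetic for in-range values.

```c
#include <assert.h>
#include <stdint.h>

/* Source pixel coordinates for one output texel. */
typedef struct {
    uint32_t x;
    uint32_t y;
} SourcePos;

/* CPU mirror of the index math in stripe_Kernel (hypothetical name). */
SourcePos stripe_source(uint32_t pos_x, uint32_t pos_y,
                        uint32_t image_w, uint32_t image_h,
                        uint32_t texture_w)
{
    assert(image_w > 0 && image_h > 0);
    SourcePos src;
    uint32_t stripe = pos_y / image_h;  /* which stripe this output row is in */
    src.x = (stripe * texture_w + pos_x) % image_w;
    src.y = pos_y % image_h;
    return src;
}

/* Linear index into the source buffer, as in `sourceX + sourceY * imageW`. */
uint32_t stripe_source_index(SourcePos src, uint32_t image_w)
{
    return src.x + src.y * image_w;
}
```

For instance, assuming a 4096x448 source packed into the 2048x896 texture (the source size is an illustrative assumption), output row 448 starts the second stripe, so texel (0, 448) reads source pixel (2048, 0).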

Later I read back the texture contents and copy them into a CVPixelBuffer:

MTLRegion region = MTLRegionMake2D(0, 0, (int)outputSize.width, (int)outputSize.height);
// lock buffers, copy texture over
CVPixelBufferLockBaseAddress(outBuffer, 0);
void *pixelData = CVPixelBufferGetBaseAddress(outBuffer);
[metalTexture getBytes:CVPixelBufferGetBaseAddress(outBuffer)
           bytesPerRow:CVPixelBufferGetBytesPerRow(outBuffer)
            fromRegion:region
           mipmapLevel:0];
CVPixelBufferUnlockBaseAddress(outBuffer, 0);

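My problem is that my CVPixelBuffer is always empty (allocated, but all zeros). I'm running on an iMac 17,1 with a Radeon M395 GPU.

I even tried writing opaque red pixels straight into the output texture in the kernel shader. Still, I don't see any red.

Update: My solution to this problem was to abandon MTLTextures entirely (I even tried synchronizing the texture with an MTLBlitCommandEncoder), but no dice.

I ended up using MTLBuffers for both the input "texture" and the output "texture" and rewrote the math in the kernel shader. My output buffer is now a pre-allocated, locked CVPixelBuffer, which is what I ultimately wanted anyway.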

1 Answer:

Answer 0: (score: 1)

First, with MTLTextureUsage.renderTarget I get the error "validateComputeFunctionArguments:825: failed assertion `Function writes texture (outTexture[0]) whose usage (0x04) doesn't specify MTLTextureUsageShaderWrite (0x02)'", so the usage should be MTLTextureUsage.shaderWrite.
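
The assertion boils down to a bit test on the usage mask. As a rough illustration in plain C (the enum names are stand-ins for the MTLTextureUsage option bits, and the helper is hypothetical; the 0x04 and 0x02 values are the ones printed in the error message):

```c
#include <stdbool.h>

/* Stand-ins for the MTLTextureUsage option bits. */
enum {
    USAGE_SHADER_READ   = 0x01,
    USAGE_SHADER_WRITE  = 0x02,
    USAGE_RENDER_TARGET = 0x04
};

/* Hypothetical helper: true when a kernel that writes the texture
   would pass the usage check described in the assertion. */
bool usage_allows_shader_write(unsigned usage)
{
    return (usage & USAGE_SHADER_WRITE) != 0;
}
```

A usage of render target only (0x04) fails this test, which matches the message above. Since the usage is an option set, the shader-write bit can be combined with others if the texture is also needed elsewhere.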

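For some reason, if I force the Intel GPU with gfxSwitch, reading back the texture returns the correct data, but on the Radeon the texture is always zeros, no matter which "textureDesc.resourceOptions = MTLResourceOptions.storageModeXXX" flag I set on the texture descriptor.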

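What worked for me on both the Intel GPU and the Radeon 460 was to create an MTLBuffer and use it instead of the texture. You do have to compute the indices yourself, though. If you don't use mipmapping or sample with floating-point coordinates, switching to a buffer shouldn't be a big deal, right?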

let texBuffer = device?.makeBuffer(length: 4 * width * height, options: MTLResourceOptions.storageModeShared)

var result = [Float](repeating: 0, count: width * height * 4)
let data = NSData(bytesNoCopy: texBuffer!.contents(), length: 4 * width * height, freeWhenDone: false)
data.getBytes(&result, length: 4 * width * height)
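
As for computing the indices yourself, here is a minimal sketch in C. It assumes a tightly packed, row-major buffer with four components per pixel; the answer does not state the layout, and `pixel_index` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

enum { CHANNELS_PER_PIXEL = 4 };  /* RGBA; an assumed layout */

/* Linear element index of one channel of pixel (x, y) in a buffer
   that stands in for a width-by-height texture with no row padding. */
size_t pixel_index(size_t x, size_t y, size_t width, size_t channel)
{
    assert(x < width);
    assert(channel < CHANNELS_PER_PIXEL);
    return (y * width + x) * CHANNELS_PER_PIXEL + channel;
}
```

The same expression can be used inside the kernel with `gid.x` and `gid.y`, provided the buffer really has no padding between rows.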

I thought creating a texture backed by an MTLBuffer would work, but that API is only available in macOS 10.13.

Edit: As Ken Thomases pointed out, there was a similar discussion in "Metal kernels not behaving properly on the new MacBook Pro (late 2016) GPUs".

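I made a sample application using the approach and shader from the first post of this thread, and the fix from the linked thread works for me. Here is a link to the application code in case anyone wants a reproducible example: https://gist.github.com/astarasikov/9e4f58e540a6ff066806d37eb5b2af29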