How to convert an RGBA texture to Y and CbCr textures in Metal

Asked: 2019-09-30 22:02:41

Tags: ios swift gpu arkit metal

Apple has a helpful tutorial called Displaying an AR Experience with Metal that shows how to extract the Y and CbCr textures from an ARFrame's capturedImage property and convert them to RGB for rendering. However, I've run into trouble trying to take an RGBA texture and perform the inverse operation, i.e. convert it back into Y and CbCr textures.

I rewrote the fragment shader from that tutorial as a compute shader that writes to an rgba texture I created from a Metal buffer:

// Same as capturedImageFragmentShader but it's a kernel function instead
kernel void yCbCrToRgbKernel(texture2d<float, access::sample> yTexture [[ texture(kTextureIndex_Y) ]],
                             texture2d<float, access::sample> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                             texture2d<float, access::write> rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                             uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);

    const float4x4 ycbcrToRGBTransform = float4x4(
        float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
        float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
        float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
        float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f)
    );

    float4 ycbcr = float4(yTexture.sample(colorSampler, float2(gid)).r, cbCrTexture.sample(colorSampler, float2(gid)).rg, 1.0);
    float4 result = ycbcrToRGBTransform * ycbcr;
    rgbaTexture.write(result, ushort2(gid));
}

I tried to write a second compute shader that performs the reverse operation, computing the Y, Cb and Cr values using the conversion formulae from the Wikipedia page on YCbCr:

kernel void rgbaToYCbCrKernel(texture2d<float, access::write> yTexture [[ texture(kTextureIndex_Y) ]],
                              texture2d<float, access::write> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                              texture2d<float, access::sample> rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                              uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);

    float4 rgba = rgbaTexture.sample(colorSampler, float2(gid)).rgba;

    // see https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion for conversion formulae
    float Y = 16.0 + (65.481 * rgba.r + 128.553 * rgba.g + 24.966 * rgba.b);
    float Cb = 128 + (-37.797 * rgba.r - 74.203 * rgba.g + 112.0 * rgba.b);
    float Cr = 128 + (112.0 * rgba.r - 93.786 * rgba.g - 18.214 * rgba.b);

    yTexture.write(Y, gid);
    cbCrTexture.write(float4(Cb, Cr, 0, 0), gid); // this probably is not correct...
}

My problem is how to correctly write data to these textures. I know this is incorrect because the result is displayed as solid pink. The expected result is obviously the original, unmodified image.

The pixel formats of the Y, CbCr and RGBA textures are .r8Unorm, .rg8Unorm and .rgba8Unorm respectively.

Here is my Swift code for setting up the textures and executing the shaders:

private func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer, pixelFormat: MTLPixelFormat, planeIndex: Int) -> MTLTexture? {
    guard CVMetalTextureCacheCreate(kCFAllocatorSystemDefault, nil, device, nil, &capturedImageTextureCache) == kCVReturnSuccess else {
        return nil
    }

    var mtlTexture: MTLTexture? = nil
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)

    var texture: CVMetalTexture? = nil
    let status = CVMetalTextureCacheCreateTextureFromImage(nil, capturedImageTextureCache!, pixelBuffer, nil, pixelFormat, width, height, planeIndex, &texture)
    if status == kCVReturnSuccess {
        mtlTexture = CVMetalTextureGetTexture(texture!)
    }
    return mtlTexture
}

func arFrameToRGB(frame: ARFrame) {
    let frameBuffer = frame.capturedImage
    CVPixelBufferLockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))

    // Extract Y and CbCr textures
    let capturedImageTextureY = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .r8Unorm, planeIndex: 0)!
    let capturedImageTextureCbCr = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .rg8Unorm, planeIndex: 1)!

    // create the RGBA texture
    let rgbaBufferWidth = CVPixelBufferGetWidthOfPlane(frameBuffer, 0)
    let rgbaBufferHeight = CVPixelBufferGetHeightOfPlane(frameBuffer, 0)
    if rgbaBuffer == nil {
        rgbaBuffer = device.makeBuffer(length: 4 * rgbaBufferWidth * rgbaBufferHeight, options: [])
    }
    let rgbaTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: rgbaBufferWidth, height: rgbaBufferHeight, mipmapped: false)
    rgbaTextureDescriptor.usage = [.shaderWrite, .shaderRead]
    let rgbaTexture = rgbaBuffer?.makeTexture(descriptor: rgbaTextureDescriptor, offset: 0, bytesPerRow: 4 * rgbaBufferWidth)

    threadGroupSize = MTLSizeMake(4, 4, 1)
    threadGroupCount = MTLSizeMake((rgbaTexture!.width + threadGroupSize!.width - 1) / threadGroupSize!.width,
                                   (rgbaTexture!.height + threadGroupSize!.height - 1) / threadGroupSize!.height,
                                   1)

    let yCbCrToRGBACommandBuffer = commandQueue.makeCommandBuffer()!
    let yCbCrToRGBAComputeEncoder = yCbCrToRGBACommandBuffer.makeComputeCommandEncoder()!
    yCbCrToRGBAComputeEncoder.setComputePipelineState(yCbCrToRgbPso)
    yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
    yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
    yCbCrToRGBAComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
    yCbCrToRGBAComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
    yCbCrToRGBAComputeEncoder.endEncoding()

    let rgbaToYCbCrCommandBuffer = commandQueue.makeCommandBuffer()!
    let rgbaToYCbCrComputeEncoder = rgbaToYCbCrCommandBuffer.makeComputeCommandEncoder()!
    rgbaToYCbCrComputeEncoder.setComputePipelineState(rgbaToYCbCrPso)
    rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
    rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
    rgbaToYCbCrComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
    rgbaToYCbCrComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
    rgbaToYCbCrComputeEncoder.endEncoding()

    yCbCrToRGBACommandBuffer.commit()
    rgbaToYCbCrCommandBuffer.commit()
    yCbCrToRGBACommandBuffer.waitUntilCompleted()
    rgbaToYCbCrCommandBuffer.waitUntilCompleted()

    CVPixelBufferUnlockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))
}

The end goal is to do image processing on the rgba texture with Metal shaders and eventually write back to the Y and CbCr textures for display on screen.

Here is what I am unsure about:

  1. Given that the textures in the kernel functions have type texture2d<float, access::write> but different pixel formats, how do I write data of the correct format to these textures?

  2. Am I rewriting capturedImageFragmentShader from Displaying an AR Experience with Metal into a compute shader as simply as I think, or am I missing something there?

1 Answer:

Answer 0 (score: 0)

I just had to implement the same thing. Your first problem is a confusion between the values stored in the texture buffers and the way those values appear inside a Metal kernel. As is typical in GPU shaders, when integer values are accessed as floats they are normalized to [0,1] on read, and scaled back to [0, MaxIntValue] on write. For Metal, this conversion is documented on page 228 of https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf, section 7.7.1.1 "Converting normalized integer pixel data types to floating-point values".

For example, if the Y channel's texture format is .r8UNorm, the data is stored as 1 byte per pixel with values from 0 to 255. But once accessed in a kernel through texture2d<float>, the values will be in [0,1]. When you write to such a texture, the values are automatically scaled back to [0,255]. So in your kernel you should consider that you are dealing with values within [0,1] rather than [0,255], and adjust your transform accordingly.

The second problem is the RGBA to YCbCr transform itself. Assuming the sample from Apple is correct, we can see that it follows the JPEG conventions given near the end of the Wikipedia page. If you replace 128 by 128/255 = 0.5 and put it in matrix form, the coefficients match exactly. The extra subtlety is that the matrix is initialized in column-major order in the Metal code, so the corresponding math is:

       |+1.     +0.     +1.402  -0.701 |   |Y |
       |+1.     -0.3441 -0.7141 +0.5291|   |Cb|
RGBA = |+1.     +1.772  +0.     -0.886 | . |Cr|
       |+0.     +0.     +0.     +1.    |   |1 |

Next you need the inverse transform. You can find it in the same JPEG section of the Wikipedia page (again substituting 0.5 for 128), or if you want to use the same matrix form you can simply compute the inverse of the 4x4 matrix and use that. This is what I did, and here is what I got after putting it back in column-major order:

const float4x4 rgbaToYcbcrTransform = float4x4(
   float4(+0.2990, -0.1687, +0.5000, +0.0000),
   float4(+0.5870, -0.3313, -0.4187, +0.0000),
   float4(+0.1140, +0.5000, -0.0813, +0.0000),
   float4(+0.0000, +0.5000, +0.5000, +1.0000)
);

Then adjusting your kernel code should work (I have not tested that exact code; my texture layout is slightly different):

// Ignore alpha as we can't convert it, just set it to 1.
float3 rgb = rgbaTexture.sample(colorSampler, float2(gid)).rgb;
float4 ycbcr = rgbaToYcbcrTransform * float4(rgb, 1.0);    
yTexture.write(ycbcr[0], gid);
cbCrTexture.write(float4(ycbcr[1], ycbcr[2], 0, 0), gid);