Getting pixel values from a CVPixelBufferRef in Swift

Time: 2016-01-02 19:21:30

Tags: ios swift image-processing cvpixelbuffer

How can I get RGB (or any other format) pixel values from a CVPixelBufferRef? I have tried many approaches but have not succeeded yet.

    func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {

        let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
        CVPixelBufferLockBaseAddress(pixelBuffer, 0)
        let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)

        // Get individual pixel values here

        CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
    }

4 Answers:

Answer 0 (score: 14)

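baseAddress is an unsafe mutable pointer, or more precisely an UnsafeMutablePointer<Void>. Once you have converted the pointer from Void to a more specific type, you can easily access the memory: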

// Convert the base address to a typed pointer of the appropriate element type
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// read the data (returns value of type UInt8)
let firstByte = byteBuffer[0]

// write data
byteBuffer[3] = 90

Make sure you use the correct type (8, 16 or 32 bit unsigned integer). It depends on the video format. Most likely it is 8 bit.

Update on buffer formats:

You can specify the format when you initialize the AVCaptureVideoDataOutput instance. You basically have the choice of:

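  • BGRA: a single plane in which the blue, green, red and alpha values of each pixel are stored together in one 32-bit integer
  • 420YpCbCr8BiPlanarFullRange: two planes, the first containing one byte per pixel with the Y (luma) value, the second containing the Cb and Cr (chroma) values for groups of pixels
  • 420YpCbCr8BiPlanarVideoRange: the same as 420YpCbCr8BiPlanarFullRange, but the Y values are restricted to the range 16 - 235 (for historical reasons)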

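If you are interested in color values and speed (or rather the maximum frame rate) is not an issue, go for the simpler BGRA format. Otherwise take one of the more efficient native video formats.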
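As a rough sketch of how the BGRA format might be requested, assuming a recent Swift version (4 or later) and Apple's AVFoundation framework; the variable name `videoOutput` is only illustrative:

```swift
import AVFoundation

// Sketch: ask the capture output to deliver single-plane BGRA frames.
// `videoOutput` is a hypothetical name; the output still has to be added
// to an AVCaptureSession elsewhere.
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.videoSettings = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
]
```

To get one of the bi-planar formats instead, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange or kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange would take the place of the BGRA constant.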

If you have two planes, you must get the base address of the desired plane (see the video format example):

Video format example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// Get luma value for pixel (43, 17)
let luma = byteBuffer[17 * bytesPerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
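The luma lookup above covers plane 0 only. For the chroma values, a minimal sketch of the index arithmetic follows. It assumes the usual 4:2:0 bi-planar layout, in which plane 1 stores interleaved Cb/Cr byte pairs and each pair covers a 2 x 2 block of pixels. The helper name `chromaOffset` and the row length of 640 are made up for illustration; the real row length would come from CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1).

```swift
// Hypothetical helper: byte offset of the Cb sample covering pixel (x, y)
// inside the interleaved CbCr plane (plane 1). The matching Cr sample is
// assumed to sit at offset + 1.
func chromaOffset(x: Int, y: Int, chromaBytesPerRow: Int) -> Int {
    // Chroma is subsampled by 2 in both directions; each sample pair is 2 bytes.
    return (y / 2) * chromaBytesPerRow + (x / 2) * 2
}

// Example for pixel (43, 17) with an assumed chroma row length of 640 bytes.
let cbOffset = chromaOffset(x: 43, y: 17, chromaBytesPerRow: 640)
let crOffset = cbOffset + 1
```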

BGRA example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
// bytesPerRow counts bytes; divide by 4 to index in UInt32 units
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4
let int32Buffer = UnsafeMutablePointer<UInt32>(baseAddress)

// Get BGRA value for pixel (43, 17)
let bgra = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
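Once a 32-bit value has been read from a BGRA buffer, the individual channels can be separated with shifts and masks. The sketch below assumes a little-endian CPU (as on current iOS devices), where the in-memory byte order B, G, R, A puts blue in the lowest byte of the UInt32; `unpackBGRA` is a hypothetical helper name.

```swift
// Hypothetical helper: split a UInt32 read from a BGRA buffer into channels.
// Assumes little-endian byte order, so blue is the least significant byte.
func unpackBGRA(_ value: UInt32) -> (b: UInt8, g: UInt8, r: UInt8, a: UInt8) {
    let b = UInt8(value & 0xFF)
    let g = UInt8((value >> 8) & 0xFF)
    let r = UInt8((value >> 16) & 0xFF)
    let a = UInt8((value >> 24) & 0xFF)
    return (b, g, r, a)
}

// Example: alpha 0xFF, red 0x11, green 0x22, blue 0x33
let pixel = unpackBGRA(0xFF112233)
```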

Answer 1 (score: 8)

Update for Swift 3:

let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
let int32Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<UInt32>.self)
// bytesPerRow counts bytes; divide by 4 to index in UInt32 units
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4
// Get BGRA value for pixel (43, 17)
let bgra = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

Answer 2 (score: 3)

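In addition to Codos's answer, here is a method that gets the individual RGB values from a BGRA pixel buffer. Note: the buffer must be locked before calling it.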

func pixelFrom(x: Int, y: Int, movieFrame: CVPixelBuffer) -> (UInt8, UInt8, UInt8) {
    let baseAddress = CVPixelBufferGetBaseAddress(movieFrame)

    let bytesPerRow = CVPixelBufferGetBytesPerRow(movieFrame)
    let buffer = baseAddress!.assumingMemoryBound(to: UInt8.self)

    // Each BGRA pixel occupies 4 bytes, so x must be scaled by 4
    let index = x * 4 + y * bytesPerRow
    let b = buffer[index]
    let g = buffer[index+1]
    let r = buffer[index+2]

    return (r, g, b)
}
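Reading outside the buffer through an unsafe pointer is undefined behavior, so it can be worth validating the coordinates first. A minimal sketch of a bounds-checked offset calculation for a BGRA buffer follows. The helper name `bgraByteOffset` and the example dimensions are assumptions; in practice the width, height and bytes per row would come from CVPixelBufferGetWidth, CVPixelBufferGetHeight and CVPixelBufferGetBytesPerRow.

```swift
// Hypothetical helper: byte offset of the blue component of pixel (x, y)
// in a BGRA buffer, or nil if the coordinates fall outside the image.
func bgraByteOffset(x: Int, y: Int, width: Int, height: Int, bytesPerRow: Int) -> Int? {
    guard x >= 0, x < width, y >= 0, y < height else { return nil }
    // 4 bytes per pixel; rows may be padded, so use bytesPerRow, not width * 4.
    return y * bytesPerRow + x * 4
}

// Example with an assumed 640 x 480 frame and 2560 bytes per row.
let inside = bgraByteOffset(x: 43, y: 17, width: 640, height: 480, bytesPerRow: 2560)
let outside = bgraByteOffset(x: 640, y: 17, width: 640, height: 480, bytesPerRow: 2560)
```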

Answer 3 (score: 0)

Swift 5

I ran into the same problem and ended up with the following solution. My CVPixelBuffer has dimensions 68 x 68, which can be checked with:

CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
print(CVPixelBufferGetWidth(pixelBuffer))
print(CVPixelBufferGetHeight(pixelBuffer))

You also need to know the number of bytes per row:

print(CVPixelBufferGetBytesPerRow(pixelBuffer))

In my case this is 320.

In addition, you need to know the data type of the pixel buffer, which in my case is Float32.

I then constructed a byte buffer and read the values consecutively as follows (remember to lock the base address as shown above):

var byteBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<Float32>.self)
var pixelArray: Array<Array<Float>> = Array(repeating: Array(repeating: 0, count: 68), count: 68)
for row in 0...67{
    for col in 0...67{
        pixelArray[row][col] = byteBuffer.pointee
        byteBuffer = byteBuffer.successor()    
    }
    byteBuffer = byteBuffer.advanced(by: 12)
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

You may be wondering about the byteBuffer = byteBuffer.advanced(by: 12) part. The reason we have to do this is as follows.

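We know that each row has 320 bytes. However, our buffer has a width of 68 and the data type is Float32, i.e. 4 bytes per value. This means that each row actually holds only 272 bytes of data (68 * 4), followed by zero padding. The zero padding probably exists for memory layout reasons.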

We therefore have to skip the last 48 bytes of each row, which is what byteBuffer = byteBuffer.advanced(by: 12) does (12 * 4 = 48).
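The 12 does not have to be hard-coded. As a small sketch, the number of padding elements per row can be derived from the bytes per row, the element size and the width; `paddingElementsPerRow` is a hypothetical helper, and the numbers are the ones from this answer.

```swift
// Hypothetical helper: how many elements of padding follow the visible
// pixels in each row, given the row length in bytes.
func paddingElementsPerRow(bytesPerRow: Int, width: Int, bytesPerElement: Int) -> Int {
    return bytesPerRow / bytesPerElement - width
}

// Values from the answer: 320 bytes per row, width 68, Float32 elements (4 bytes).
let padding = paddingElementsPerRow(
    bytesPerRow: 320,
    width: 68,
    bytesPerElement: MemoryLayout<Float32>.stride
)
// 320 / 4 - 68 = 12, the argument passed to advanced(by:)
```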

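This approach differs from the other solutions in that we step a pointer through the buffer from one value to the next. However, I find this simpler and more intuitive.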