Tracking eyes using the Vision framework

Date: 2017-11-03 10:28:54

Tags: ios objective-c computer-vision ios-vision

How can I use the new Vision framework in iOS 11 to track eyes in video while the head or the camera is moving? (Using the front camera.)

I'm finding VNDetectFaceLandmarksRequest very slow on my iPad - a landmarks request completes only about once every 1-2 seconds. I feel like I'm doing something wrong, but there isn't much documentation on Apple's site.

I have already watched the WWDC 2017 session on Vision:

https://developer.apple.com/videos/play/wwdc2017/506/

and read this guide:

https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision

My code currently looks roughly like this (sorry, it's Objective-C):

// Capture session setup

- (BOOL)setUpCaptureSession {
    AVCaptureDevice *captureDevice = [AVCaptureDevice
                                      defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                      mediaType:AVMediaTypeVideo
                                      position:AVCaptureDevicePositionFront];
    NSError *error;
    AVCaptureDeviceInput *captureInput = [AVCaptureDeviceInput deviceInputWithDevice:captureDevice error:&error];
    if (error != nil) {
        NSLog(@"Failed to initialize video input: %@", error);
        return NO;
    }

    self.captureOutputQueue = dispatch_queue_create("CaptureOutputQueue",
                                                    DISPATCH_QUEUE_SERIAL);

    AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    captureOutput.alwaysDiscardsLateVideoFrames = YES;
    [captureOutput setSampleBufferDelegate:self queue:self.captureOutputQueue];

    self.captureSession = [[AVCaptureSession alloc] init];
    self.captureSession.sessionPreset = AVCaptureSessionPreset1280x720;
    [self.captureSession addInput:captureInput];
    [self.captureSession addOutput:captureOutput];

    return YES;
}
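
One thing I'm unsure about: the delegate below reads the camera intrinsic matrix attachment from the sample buffer, and as far as I can tell that attachment is only present when intrinsic matrix delivery has been enabled on the capture connection. A minimal sketch of what I assume is needed at the end of -setUpCaptureSession, after the output has been added (untested):

// Assumption: enable intrinsic matrix delivery so the attachment read in the
// delegate is actually attached to each sample buffer.
AVCaptureConnection *connection = [captureOutput connectionWithMediaType:AVMediaTypeVideo];
if (connection.cameraIntrinsicMatrixDeliverySupported) {
    connection.cameraIntrinsicMatrixDeliveryEnabled = YES;
}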

// Capture output delegate:

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    if (!self.detectionStarted) {
        return;
    }

    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == nil) {
        return;
    }

    NSMutableDictionary<VNImageOption, id> *requestOptions = [NSMutableDictionary dictionary];
    CFTypeRef cameraIntrinsicData = CMGetAttachment(sampleBuffer,
                                                    kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                                    nil);
    requestOptions[VNImageOptionCameraIntrinsics] = (__bridge id)(cameraIntrinsicData);

    // TODO: Detect device orientation
    static const CGImagePropertyOrientation orientation = kCGImagePropertyOrientationRight;

    VNDetectFaceLandmarksRequest *landmarksRequest =
        [[VNDetectFaceLandmarksRequest alloc] initWithCompletionHandler:^(VNRequest *request, NSError *error) {
            if (error != nil) {
                NSLog(@"Error while detecting face landmarks: %@", error);
            } else {
                dispatch_async(dispatch_get_main_queue(), ^{
                    // Draw eyes in two corresponding CAShapeLayers
                });
            }
        }];


    VNImageRequestHandler *requestHandler = [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                                                     orientation:orientation
                                                                                         options:requestOptions];
    NSError *error;
    if (![requestHandler performRequests:@[landmarksRequest] error:&error]) {
        NSLog(@"Error performing landmarks request: %@", error);
        return;
    }
}

Is it correct to call -performRequests:… on the same queue as the video output? From my experiments this method seems to invoke the request's completion handler synchronously. Should I not be calling it on every frame?
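
What I'm leaning towards (not sure it's the right approach) is keeping the Vision work off the capture queue and simply dropping frames while a request is still in flight. A rough sketch; visionBusy and visionQueue are properties I would add myself, they are not part of any framework:

// Sketch: skip frames while a previous Vision request is still running.
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL || self.visionBusy) {
        return; // drop this frame rather than queueing more Vision work
    }
    self.visionBusy = YES;
    CFRetain(pixelBuffer); // keep the buffer alive past this delegate callback

    dispatch_async(self.visionQueue, ^{
        VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
        VNImageRequestHandler *handler =
            [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                     orientation:kCGImagePropertyOrientationRight
                                                         options:@{}];
        NSError *requestError;
        if ([handler performRequests:@[landmarksRequest] error:&requestError]) {
            dispatch_async(dispatch_get_main_queue(), ^{
                // Update the eye CAShapeLayers from landmarksRequest.results here
            });
        }
        CFRelease(pixelBuffer);
        self.visionBusy = NO;
    });
}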

To speed things up I also tried using VNTrackObjectRequest to track each eye separately once the landmarks had been detected in the video (by building a bounding box from the landmark region's points), but that didn't work very well (I'm still trying to figure it out).
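
For reference, this is roughly what that attempt looks like; the helper, self.sequenceHandler and self.eyeTrackingRequest are my own names, and I may well be misusing VNTrackObjectRequest:

// Given a VNFaceObservation from the landmarks request, build a normalized
// bounding box around the left eye and seed a tracking request with it.
- (VNTrackObjectRequest *)eyeTrackingRequestForFace:(VNFaceObservation *)face {
    VNFaceLandmarkRegion2D *eye = face.landmarks.leftEye;
    if (eye == nil || eye.pointCount == 0) {
        return nil;
    }

    // Landmark points are normalized to the face bounding box, which is in turn
    // normalized to the image, so convert them to image-normalized coordinates.
    CGRect faceBox = face.boundingBox;
    const CGPoint *points = eye.normalizedPoints;
    CGFloat minX = 1.0, minY = 1.0, maxX = 0.0, maxY = 0.0;
    for (NSUInteger i = 0; i < eye.pointCount; i++) {
        CGFloat x = faceBox.origin.x + points[i].x * faceBox.size.width;
        CGFloat y = faceBox.origin.y + points[i].y * faceBox.size.height;
        minX = MIN(minX, x); maxX = MAX(maxX, x);
        minY = MIN(minY, y); maxY = MAX(maxY, y);
    }
    CGRect eyeBox = CGRectMake(minX, minY, maxX - minX, maxY - minY);

    VNDetectedObjectObservation *seed =
        [VNDetectedObjectObservation observationWithBoundingBox:eyeBox];
    return [[VNTrackObjectRequest alloc] initWithDetectedObjectObservation:seed];
}

// On subsequent frames (self.sequenceHandler is a VNSequenceRequestHandler
// kept across frames, self.eyeTrackingRequest is the request created above):
NSError *trackingError;
[self.sequenceHandler performRequests:@[self.eyeTrackingRequest]
                      onCVPixelBuffer:pixelBuffer
                          orientation:kCGImagePropertyOrientationRight
                                error:&trackingError];
VNDetectedObjectObservation *tracked =
    (VNDetectedObjectObservation *)self.eyeTrackingRequest.results.firstObject;
if (tracked != nil) {
    self.eyeTrackingRequest.inputObservation = tracked; // feed back for the next frame
}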

What is the best strategy for tracking eyes in video? Should I track a face rectangle first and then run the landmarks request only inside its region (would that be faster)?
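
In case it helps frame the question, this is the kind of two-step approach I have in mind inside the capture delegate (untested; it relies on the inputFaceObservations property that VNDetectFaceLandmarksRequest exposes):

// Step 1: find the face cheaply (or reuse a previously tracked face observation).
VNDetectFaceRectanglesRequest *rectanglesRequest = [[VNDetectFaceRectanglesRequest alloc] init];
NSError *error;
if (![requestHandler performRequests:@[rectanglesRequest] error:&error]) {
    return;
}
VNFaceObservation *face = (VNFaceObservation *)rectanglesRequest.results.firstObject;
if (face == nil) {
    return;
}

// Step 2: run landmarks only on that face, so Vision can skip the full-frame search.
VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
landmarksRequest.inputFaceObservations = @[face];
if ([requestHandler performRequests:@[landmarksRequest] error:&error]) {
    // landmarksRequest.results now contains VNFaceObservations with .landmarks filled in
}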

0 Answers:

No answers yet.