How can I track eyes in a video with the new Vision framework in iOS 11 while the head or the camera is moving (using the front camera)?
I found that VNDetectFaceLandmarksRequest
is very slow on my iPad: a landmarks request takes roughly 1-2 seconds to complete. I feel like I'm doing something wrong, but there isn't much documentation on Apple's site.
I have already watched the WWDC 2017 session on Vision:
https://developer.apple.com/videos/play/wwdc2017/506/
and read this guide:
https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision
My code currently looks roughly like this (sorry, it's Objective-C):
// Capture session setup
- (BOOL)setUpCaptureSession {
    AVCaptureDevice *captureDevice = [AVCaptureDevice
        defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                          mediaType:AVMediaTypeVideo
                           position:AVCaptureDevicePositionFront];
    NSError *error;
    AVCaptureDeviceInput *captureInput = [AVCaptureDeviceInput deviceInputWithDevice:captureDevice
                                                                                error:&error];
    if (error != nil) {
        NSLog(@"Failed to initialize video input: %@", error);
        return NO;
    }

    self.captureOutputQueue = dispatch_queue_create("CaptureOutputQueue", DISPATCH_QUEUE_SERIAL);

    AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    captureOutput.alwaysDiscardsLateVideoFrames = YES;
    [captureOutput setSampleBufferDelegate:self queue:self.captureOutputQueue];

    self.captureSession = [[AVCaptureSession alloc] init];
    self.captureSession.sessionPreset = AVCaptureSessionPreset1280x720;
    [self.captureSession addInput:captureInput];
    [self.captureSession addOutput:captureOutput];
    return YES;
}
// Capture output delegate:
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    if (!self.detectionStarted) {
        return;
    }

    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == nil) {
        return;
    }

    NSMutableDictionary<VNImageOption, id> *requestOptions = [NSMutableDictionary dictionary];
    CFTypeRef cameraIntrinsicData = CMGetAttachment(sampleBuffer,
                                                    kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                                    nil);
    requestOptions[VNImageOptionCameraIntrinsics] = (__bridge id)(cameraIntrinsicData);

    // TODO: Detect device orientation
    static const CGImagePropertyOrientation orientation = kCGImagePropertyOrientationRight;

    VNDetectFaceLandmarksRequest *landmarksRequest =
        [[VNDetectFaceLandmarksRequest alloc] initWithCompletionHandler:^(VNRequest *request, NSError *error) {
            if (error != nil) {
                NSLog(@"Error while detecting face landmarks: %@", error);
            } else {
                dispatch_async(dispatch_get_main_queue(), ^{
                    // Draw eyes in two corresponding CAShapeLayers
                });
            }
        }];

    VNImageRequestHandler *requestHandler =
        [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                 orientation:orientation
                                                     options:requestOptions];
    NSError *error;
    if (![requestHandler performRequests:@[landmarksRequest] error:&error]) {
        NSLog(@"Error performing landmarks request: %@", error);
        return;
    }
}
Is it correct to call -performRequests:.. on the same queue as the video output?
Based on my experiments, this method seems to invoke the request's completion handler synchronously. Should I not be calling it on every frame?
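For example, is the intended pattern something like the sketch below, where frames are dropped while an earlier request is still running and the Vision work is moved off the capture queue? (visionQueue and requestInFlight are hypothetical names; neither exists in the code above.)

// Rough sketch: skip frames while an earlier Vision request is still running,
// and run the request off the capture queue.
// (self.visionQueue and self.requestInFlight are hypothetical additions.)
if (self.requestInFlight) {
    return; // an earlier frame is still being processed; drop this one
}
self.requestInFlight = YES;
dispatch_async(self.visionQueue, ^{
    // assuming requestHandler keeps the pixel buffer alive for the duration of the request
    NSError *visionError;
    if (![requestHandler performRequests:@[landmarksRequest] error:&visionError]) {
        NSLog(@"Error performing landmarks request: %@", visionError);
    }
    self.requestInFlight = NO;
});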
To speed things up, I also tried using VNTrackObjectRequest
to track each eye separately after landmarks had been detected in the video (by constructing a bounding box from the landmark region's points), but that didn't work very well (I'm still trying to figure it out).
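Roughly, the per-eye tracking setup I tried looks like the sketch below. The EyeBoundingBox helper and the sequenceHandler property are my own names, and I'm assuming normalizedPoints are relative to the face's bounding box:

// Sketch: build a normalized-image-space bounding box for one eye from its landmarks.
static CGRect EyeBoundingBox(VNFaceLandmarkRegion2D *eye, VNFaceObservation *face) {
    CGFloat minX = 1.0, minY = 1.0, maxX = 0.0, maxY = 0.0;
    const CGPoint *points = eye.normalizedPoints; // normalized to the face bounding box
    for (NSUInteger i = 0; i < eye.pointCount; i++) {
        minX = MIN(minX, points[i].x);
        minY = MIN(minY, points[i].y);
        maxX = MAX(maxX, points[i].x);
        maxY = MAX(maxY, points[i].y);
    }
    // Map from face-box coordinates into normalized image coordinates.
    CGRect faceBox = face.boundingBox;
    return CGRectMake(faceBox.origin.x + minX * faceBox.size.width,
                      faceBox.origin.y + minY * faceBox.size.height,
                      (maxX - minX) * faceBox.size.width,
                      (maxY - minY) * faceBox.size.height);
}

// Later, once landmarks have been detected for faceObservation:
CGRect leftEyeBox = EyeBoundingBox(faceObservation.landmarks.leftEye, faceObservation);
VNDetectedObjectObservation *leftEye =
    [VNDetectedObjectObservation observationWithBoundingBox:leftEyeBox];
VNTrackObjectRequest *leftEyeTracking =
    [[VNTrackObjectRequest alloc] initWithDetectedObjectObservation:leftEye];
leftEyeTracking.trackingLevel = VNRequestTrackingLevelAccurate;

// Tracking requests go through a VNSequenceRequestHandler that is reused
// across frames (self.sequenceHandler), not a new VNImageRequestHandler per frame.
NSError *trackingError;
[self.sequenceHandler performRequests:@[leftEyeTracking]
                      onCVPixelBuffer:pixelBuffer
                          orientation:orientation
                                error:&trackingError];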
What is the best strategy for tracking eyes in a video? Should I track a face rectangle and then run the landmarks request inside its region (would that be faster)?
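If that's the right approach, I assume it would look roughly like this, where trackedFaceBox is the normalized bounding box coming back from the face tracker (the variable name is made up):

// Sketch: feed the tracked face rectangle into the landmarks request so Vision
// can skip its own full-frame face detection (VNDetectFaceLandmarksRequest
// conforms to VNFaceObservationAccepting).
VNFaceObservation *trackedFace = [VNFaceObservation observationWithBoundingBox:trackedFaceBox];
VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
landmarksRequest.inputFaceObservations = @[trackedFace];
// Then perform landmarksRequest with a VNImageRequestHandler as before.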