AVCapture seems to lag, and text recognition does not start immediately

Date: 2019-04-24 19:52:23

Tags: ios swift avfoundation firebase-mlkit

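I'm new to Swift and am currently building a feature that includes text recognition. I'm using Firebase's ML Kit and have the code below. It isn't very sophisticated (I'm open to any suggestions for improving my code), but it already took a fair amount of setup.

In any case, two things are bothering me:

  1. Since I added text recognition, the live feed seems to lag (roughly 1 frame per second). I assume this is somehow caused by the text recognition, perhaps to prevent overload? If so, how can I decouple the live preview from the frames being processed?
  2. Text recognition seems to start only after 10 seconds. Is there a way to make it start immediately?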

CameraViewController:

import UIKit
import AVKit
import Vision
import FirebaseMLVision


class CameraViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {    

    private lazy var vision = Vision.vision()
    private lazy var textRecognizer = vision.onDeviceTextRecognizer()

    override func viewDidLoad() {
        super.viewDidLoad()
        captureSession()
    }

    func captureSession () {
        let captureSession = AVCaptureSession()

        guard let captureDevice = AVCaptureDevice.default(for: .video) else { return }
        guard let input = try? AVCaptureDeviceInput(device: captureDevice) else{ return }
        captureSession.addInput(input)

        captureSession.startRunning()

        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        view.layer.addSublayer(previewLayer)
        previewLayer.frame = view.frame

        let dataOutput = AVCaptureVideoDataOutput()
        dataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
        captureSession.addOutput(dataOutput)
    }

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {


        let metadata = VisionImageMetadata()

        let devicePosition: AVCaptureDevice.Position = .back

        let deviceOrientation = UIDevice.current.orientation

        switch deviceOrientation {
        case .portrait:
            metadata.orientation = devicePosition == .front ? .leftTop : .rightTop
        case .landscapeLeft:
            metadata.orientation = devicePosition == .front ? .bottomLeft : .topLeft
        case .portraitUpsideDown:
            metadata.orientation = devicePosition == .front ? .rightBottom : .leftBottom
        case .landscapeRight:
            metadata.orientation = devicePosition == .front ? .topRight : .bottomRight
        case .faceDown, .faceUp, .unknown:
            metadata.orientation = .leftTop
        }

        let image = VisionImage(buffer: sampleBuffer)
        image.metadata = metadata

        textRecognizer.process(image) { result, error in
            guard error == nil, let result = result else {
                return
            }

            for block in result.blocks {
                for line in block.lines {
                    for element in line.elements {
                        let elementText = element.text
                        print(element.text)
                    }
                }
            }
        }
    }

}

1 Answer:

Answer 0 (score: 1)

You need to update your AVCaptureVideoDataOutput (dataOutput in the question's code):

output.alwaysDiscardsLateVideoFrames = true
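
For context, here is a minimal sketch of where that line could sit in the question's setup. It is not taken from the answer or from the linked sample. The class name, the queue labels and the method name configureAndStartSession are made up for illustration, and adding the output before startRunning() and starting the session off the main thread are my own assumptions about what may help. As far as I can tell, Apple's documentation lists true as the default for alwaysDiscardsLateVideoFrames, so the line may only make the default explicit.

```swift
import AVFoundation
import UIKit

// Sketch only: the class name, queue labels and method names are hypothetical.
final class CameraSketchViewController: UIViewController,
                                        AVCaptureVideoDataOutputSampleBufferDelegate {

    // Kept as a property so the session outlives the setup method.
    private let captureSession = AVCaptureSession()
    private let sessionQueue = DispatchQueue(label: "sessionQueue")
    private let videoQueue = DispatchQueue(label: "videoQueue")

    override func viewDidLoad() {
        super.viewDidLoad()

        // The preview layer renders straight from the session; it does not
        // go through the sample buffer delegate below.
        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.frame = view.bounds
        view.layer.addSublayer(previewLayer)

        // startRunning() blocks until the session is up, so run the setup
        // on a background queue instead of the main thread.
        sessionQueue.async { [weak self] in
            self?.configureAndStartSession()
        }
    }

    private func configureAndStartSession() {
        guard let device = AVCaptureDevice.default(for: .video),
              let input = try? AVCaptureDeviceInput(device: device),
              captureSession.canAddInput(input) else { return }
        captureSession.addInput(input)

        let dataOutput = AVCaptureVideoDataOutput()
        // The setting from the answer: frames that arrive while the delegate
        // is still busy are dropped instead of queued.
        dataOutput.alwaysDiscardsLateVideoFrames = true
        dataOutput.setSampleBufferDelegate(self, queue: videoQueue)

        guard captureSession.canAddOutput(dataOutput) else { return }
        captureSession.addOutput(dataOutput)

        // Start only after both input and output are attached.
        captureSession.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Hand sampleBuffer to the text recognizer here, as in the question.
    }
}
```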

https://github.com/googlecodelabs/mlkit-ios/blob/master/translate/TranslateDemo/CameraViewController.swift#L307
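
One more idea for the first point of the question (decoupling the preview from the processed frames): skip new frames while the recognizer is still working on the previous one. Below is a rough sketch of such a guard. FrameGate, tryEnter and leave are hypothetical names, not part of AVFoundation or ML Kit, and whether queued-up recognition calls are really what causes the lag here is an assumption on my part.

```swift
import Foundation

/// Hypothetical helper: lets at most one frame be "in flight" at a time.
final class FrameGate {
    private let lock = NSLock()
    private var isBusy = false

    /// Returns true if the caller may process the current frame,
    /// false if an earlier frame is still being processed.
    func tryEnter() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if isBusy { return false }
        isBusy = true
        return true
    }

    /// Call when processing of the admitted frame has finished.
    func leave() {
        lock.lock()
        defer { lock.unlock() }
        isBusy = false
    }
}

// Small demonstration of the gate's behaviour.
let gate = FrameGate()
let firstAdmitted = gate.tryEnter()   // nothing in flight yet
let secondAdmitted = gate.tryEnter()  // first frame has not finished
gate.leave()                          // first frame done
let thirdAdmitted = gate.tryEnter()   // gate is free again
```

In captureOutput you would start with `guard gate.tryEnter() else { return }` and call `gate.leave()` on every path through the textRecognizer.process completion handler, including the early return on error. The lock is there because the delegate callback and the completion handler may run on different queues.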