Question

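I'm having an issue with VNClassificationObservation.

My goal is to recognize an object and display a popup with the object's name. I'm able to get the name, but I can't get the object's coordinates or frame.

Here is the code: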

let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: requestOptions)
do {
    try handler.perform([classificationRequest, detectFaceRequest])
} catch {
    print(error)
}

Then I handle the results:

func handleClassification(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNClassificationObservation] else {
        fatalError("unexpected result type from VNCoreMLRequest")
    }

    // Filter observations: keep the first 11 results with confidence above 10%.
    // prefix(_:) avoids an out-of-range crash when fewer than 11 results come back.
    let filteredObservations = observations.prefix(11).filter { $0.confidence > 0.1 }

    // Update UI
    DispatchQueue.main.async { [weak self] in
        for observation in filteredObservations {
            print("observation: ", observation.identifier)
            // HERE: I need to display popup with observation name
        }
    }
}

Update:

lazy var classificationRequest: VNCoreMLRequest = {

    // Load the ML model through its generated class and create a Vision request for it.
    do {
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        request.imageCropAndScaleOption = .centerCrop
        return request
    } catch {
        fatalError("can't load Vision ML model: \(error)")
    }
}()

Answer 1

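A pure classifier model can only answer "what is this a picture of?", not detect and locate objects in the picture. All the free models on the Apple developer site (including Inception v3) are of this kind.

When Vision works with such a model, it identifies the model as a classifier based on the outputs declared in the MLModel file, and returns VNClassificationObservation objects as output.

If you find or create a model that's trained to both identify and locate objects, you can still use it with Vision. When you convert that model to Core ML format, the MLModel file will describe multiple outputs. When Vision works with a model that has multiple outputs, it returns an array of VNCoreMLFeatureValueObservation objects, one for each output of the model.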

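How the model declares its outputs determines which feature values represent what. A model that reports a classification and a bounding box could output a string and four doubles, or a string and a multi-array, and so on.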
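
As a rough sketch of what handling such a model might look like: the handler below walks the VNCoreMLFeatureValueObservation array and inspects each feature value by type. The `NormalizedBox` type, the `makeBox` helper, and the "four doubles as x, y, width, height" layout are assumptions made for illustration only; a real model defines its own output layout, so check its MLModel description before decoding anything.

```swift
import Foundation
#if canImport(Vision)
import Vision
import CoreML
#endif

/// Hypothetical container for a decoded bounding box (not a Vision type).
struct NormalizedBox {
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

/// Decodes an ASSUMED layout of four doubles: [x, y, width, height].
/// Returns nil when the array does not contain exactly four values.
func makeBox(from values: [Double]) -> NormalizedBox? {
    guard values.count == 4 else { return nil }
    return NormalizedBox(x: values[0], y: values[1], width: values[2], height: values[3])
}

#if canImport(Vision)
/// Completion handler for a VNCoreMLRequest whose model is not a pure classifier.
func handleDetection(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNCoreMLFeatureValueObservation] else {
        print("unexpected result type: \(String(describing: error))")
        return
    }
    // Vision returns one observation per output declared by the model.
    for observation in observations {
        let value = observation.featureValue
        switch value.type {
        case .string:
            print("string output:", value.stringValue)
        case .double:
            print("double output:", value.doubleValue)
        case .multiArray:
            if let array = value.multiArrayValue {
                print("multi-array output with shape:", array.shape)
            }
        default:
            break
        }
    }
}
#endif
```

Which output carries the label and which carries the box depends entirely on the model, so the decoding step has to be written against that model's declared outputs.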

Addendum: Here's a model that works on iOS 11 and returns VNCoreMLFeatureValueObservation: TinyYOLO

Answer 2

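That's because classifiers do not return object coordinates or frames. A classifier only gives a probability distribution over a list of categories.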
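
To make that concrete, here is a minimal sketch of what a classifier's output amounts to: a list of label and confidence pairs, with nothing that describes a location. The `Prediction` type and the `topPredictions` function are hypothetical stand-ins for illustration, not Vision API.

```swift
/// Hypothetical stand-in for a classification result: a label and a confidence only.
struct Prediction {
    let identifier: String
    let confidence: Double
}

/// Returns up to `limit` of the most confident predictions above `minConfidence`.
func topPredictions(_ predictions: [Prediction],
                    limit: Int,
                    minConfidence: Double) -> [Prediction] {
    return predictions
        .sorted { $0.confidence > $1.confidence }   // most confident first
        .prefix(max(0, limit))                      // safe even if fewer results exist
        .filter { $0.confidence > minConfidence }
}

let sample = [
    Prediction(identifier: "dog", confidence: 0.25),
    Prediction(identifier: "cat", confidence: 0.70),
    Prediction(identifier: "car", confidence: 0.05),
]
let top = topPredictions(sample, limit: 10, minConfidence: 0.1)
for prediction in top {
    print(prediction.identifier, prediction.confidence)
}
```

There is no rect or coordinate field to read here, which is why a frame cannot be recovered from classification results alone.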

What model are you using here?

How to get object rect/coordinates from VNClassificationObservation
