Question

I am trying to use the Google Vision API to detect faces in a video. I am using the following code:

import argparse
import cv2
from google.cloud import vision
from PIL import Image, ImageDraw


def detect_face(face_file, max_results=4):
    """Uses the Vision API to detect faces in the given file.
    Args:
        face_file: A file-like object containing an image with faces.
    Returns:
        An array of Face objects with information about the picture.
    """
    content = face_file.read()
    # [START get_vision_service]
    image = vision.Client().image(content=content)
    # [END get_vision_service]

    return image.detect_faces()


def highlight_faces(frame, faces, output_filename):
    """Draws a polygon around the faces, then saves to output_filename.
    Args:
      image: a file containing the image with the faces.
      faces: a list of faces found in the file. This should be in the format
          returned by the Vision API.
      output_filename: the name of the image file to be created, where the
          faces have polygons drawn around them.
    """
    im = Image.open(frame)
    draw = ImageDraw.Draw(im)

    for face in faces:
        box = [(bound.x_coordinate, bound.y_coordinate)
               for bound in face.bounds.vertices]
        draw.line(box + [box[0]], width=5, fill='#00ff00')

    #im.save(output_filename)


def main(input_filename, max_results):

    video_capture = cv2.VideoCapture(input_filename)


    while True:
        # Capture frame-by-frame
        ret, frame = video_capture.read()
        faces = detect_face(frame, max_results)
        highlight_faces(frame, faces)
        cv2.imshow('Video', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Detects faces in the given image.')
    parser.add_argument(
        'input_image', help='the image you\'d like to detect faces in.')
    parser.add_argument(
        '--max-results', dest='max_results', default=4,
        help='the max results of face detection.')
    args = parser.parse_args()

    main(args.input_image, args.max_results)

But I get the error:

    content = face_file.read()
AttributeError: 'numpy.ndarray' object has no attribute 'read'

"frame" is read as a numpy array, but I don't know how to get around that.

Can anyone help me?

Answer 1

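The detect_face function expects a file-like object to read data from. One possible approach is to convert frame (of type numpy.ndarray) into an image and write it to a buffer, which can then be read like a file.

For example, try making the following changes to your code: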

## Add some imports.
import io
import numpy as np
...

def main(input_filename, max_results):
    ...
    while True:
        # Capture frame-by-frame
        ret, frame = video_capture.read()

        ## Convert to an image, then write to a buffer.
        image_from_frame = Image.fromarray(np.uint8(frame))
        buffer = io.BytesIO()
        image_from_frame.save(buffer, format='PNG')
        buffer.seek(0)

        ## Use the buffer like a file.
        faces = detect_face(buffer, max_results)

        ...
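The loop above only stops when `q` is pressed, but `video_capture.read()` returns `False` as its first value once a video file has no more frames, and `frame` is then `None`. A minimal sketch of guarding against that follows. The generator `iter_frames` and the stand-in class `FakeCapture` are illustrative names, not part of OpenCV; the stand-in is there only so the sketch runs without a video file.

```python
def iter_frames(capture):
    """Yield frames from any object whose read() returns (ok, frame)."""
    while True:
        ok, frame = capture.read()
        if not ok:
            # No more frames (end of file or read error): stop cleanly.
            break
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture, used only to make the sketch runnable."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic the (ok, frame) pair that VideoCapture.read() returns.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(['frame-1', 'frame-2'])))
print(frames)
```

With a real capture, `for frame in iter_frames(video_capture):` could take the place of the `while True:` loop.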

Note: there should be a way to use image_from_frame.tobytes() as the image content in the Vision API client, but I was not able to get it working.
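One likely reason `tobytes()` does not work here is that Pillow's `Image.tobytes()` returns the raw pixel buffer rather than an encoded image file, while the Vision API is expected to receive the bytes of an encoded image such as PNG or JPEG. The sketch below illustrates the difference. It assumes NumPy and Pillow are installed and that the frame is an 8-bit, 3-channel array in OpenCV's BGR order; `frame_to_png_bytes` is a hypothetical helper name, not a library function.

```python
import io

import numpy as np
from PIL import Image


def frame_to_png_bytes(frame):
    """Encode an 8-bit BGR frame (as returned by OpenCV) as PNG bytes.

    Hypothetical helper for illustration; not part of any library.
    """
    # OpenCV stores channels as BGR; reverse the last axis to get RGB.
    rgb = np.ascontiguousarray(frame[:, :, ::-1])
    buffer = io.BytesIO()
    Image.fromarray(rgb).save(buffer, format='PNG')
    return buffer.getvalue()


# A stand-in for a video frame: 4 rows by 6 columns, all black.
frame = np.zeros((4, 6, 3), dtype=np.uint8)

raw = Image.fromarray(frame).tobytes()  # raw pixels, no file header
png = frame_to_png_bytes(frame)         # a complete PNG file in memory

print(len(raw))                          # 4 * 6 * 3 = 72 raw bytes
print(png[:8] == b'\x89PNG\r\n\x1a\n')   # True: the PNG signature
```

The encoded bytes can be wrapped as `io.BytesIO(png)` and handed to `detect_face`, which is the same idea as the buffer in the answer's code.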
