Question

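I'm building some code to adaptively detect skin from webcam video. I have it almost working; however, when outputting the video, it shows 9 "skin" masks instead of one. It seems like I'm just missing something simple, but I can't figure it out.

Code below: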

# first let's train the data
data, labels = ReadData()
classifier = TrainTree(data, labels)

# get the webcam. The input is either a video file or the camera number
# since using laptop webcam (only 1 cam), input is 0. A 2nd cam would be input 1
camera = cv2.VideoCapture(0)

while True:
    # reads in the current frame
    # .read() returns True if frame read correctly, and False otherwise
    ret, frame = camera.read()   # frame.shape: (480,640,3)

    if ret:
        # reshape the frame to follow format of training data (rows*col, 3)
        data = np.reshape(frame, (frame.shape[0] * frame.shape[1], 3))
        bgr = np.reshape(data, (data.shape[0], 1, 3))
        hsv = cv2.cvtColor(np.uint8(bgr), cv2.COLOR_BGR2HSV)
        # once we have converted to HSV, we reshape back to original shape of (245057,3)
        data = np.reshape(hsv, (hsv.shape[0], 3))
        predictedLabels = classifier.predict(data)

        # the AND operator applies the skinMask to the image
        # predictedLabels consists of 1 (skin) and 2 (non-skin), needs to change to 0 (non-skin) and 255 (skin)
        predictedMask = (-(predictedLabels - 1) + 1) * 255   # predictedMask.shape: (307200,)

        # resize to match frame shape
        imgLabels = np.resize(predictedMask, (frame.shape[0], frame.shape[1], 3))   # imgLabels.shape: (480,640,3)
        # masks require 1 channel, not 3, so change from BGR to GRAYSCALE
        imgLabels = cv2.cvtColor(np.uint8(imgLabels), cv2.COLOR_BGR2GRAY)   # imgLabels.shape: (480,640)

        # do bitwise AND to pull out skin pixels. All skin pixels are ANDed with 255 and all others are 0
        skin = cv2.bitwise_and(frame, frame, mask=imgLabels) # skin.shape: (480,640,3)
        # show the skin in the image along with the mask, show images side-by-side
        # **********THE BELOW LINE OUTPUTS 9 screens of the skin mask instead of just 1  ****************
        cv2.imshow("images", np.hstack([frame, skin]))

        # if the 'q' key is pressed, stop the loop
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

# release the video capture
camera.release()
cv2.destroyAllWindows()

Answer 1

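You're working with bitmaps. To find out what they hold, cv2.imshow each of them individually. Then you'll see (literally) where the data goes wrong.

Now, the culprit is most probably np.resize():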

np.resize(a, new_shape)

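Return a new array with the specified shape.

If the new array is larger than the original array, then the new array is filled with repeated copies of a. Note that this behavior is different from a.resize(new_shape) which fills with zeros instead of repeated copies of a.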
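To see that repetition on a small scale, here is a minimal sketch using only NumPy. The 4-element flat_mask is a made-up stand-in for the 307200-element predictedMask from the question; the target shape is three times larger, just like (480, 640, 3) is three times larger than 307200.

```python
import numpy as np

# Hypothetical tiny stand-in for the flat mask (4 "pixels" instead of 307200).
flat_mask = np.array([0, 255, 255, 0], dtype=np.uint8)

# The target shape holds 12 values, so np.resize repeats the 4 values 3 times.
tiled = np.resize(flat_mask, (2, 2, 3))
print(tiled.ravel())

# ndarray.resize pads with zeros instead of repeating.
# refcheck=False only avoids reference-count errors in interactive sessions.
padded = flat_mask.copy()
padded.resize((2, 2, 3), refcheck=False)
print(padded.ravel())
```

With the real frame, those three back-to-back copies of the mask get laid out across rows and channels, which would be consistent with seeing several shrunken copies of the mask instead of one.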

To scale a bitmap (= resize while striving to preserve the same visual image), use cv2.resize() as per OpenCV: Geometric Transformations of Images.
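That said, in the question's code the flat mask appears to already hold exactly one value per frame pixel (307200 = 480 * 640), so it may not need scaling at all, only a reshape into two dimensions. A minimal sketch under that assumption, with NumPy only: the small height and width and the random predicted_labels are placeholders for the real frame size and for the output of classifier.predict(data).

```python
import numpy as np

height, width = 4, 6  # placeholder for the 480 x 640 frame

# Placeholder for classifier.predict(data): 1 = skin, 2 = non-skin.
rng = np.random.default_rng(0)
predicted_labels = rng.integers(1, 3, size=height * width)

# Map label 1 -> 255 and label 2 -> 0, stored as 8-bit values.
flat_mask = np.where(predicted_labels == 1, 255, 0).astype(np.uint8)

# Same number of elements, so reshape keeps every value exactly once.
mask = flat_mask.reshape(height, width)
print(mask.shape, mask.dtype)
```

A single-channel uint8 array like this is the form the mask argument of cv2.bitwise_and expects, so the 3-channel array and the BGR-to-grayscale conversion would no longer be needed. cv2.resize remains the right tool when the mask and the frame really do differ in size.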
