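I have a folder of short videos and a folder of images. Most of the images are screenshots taken from one of the videos, but they may not match the frame exactly (different size, noise, detail lost to compression, and so on). My goal is to match each image to the video it was taken from. So far, I use the OpenCV library to load one video and compute the SSIM score between every video frame and every image, storing the highest SSIM score for each image. I then take the image with the highest SSIM score, associate it with that video, and run the function again for the second video.
Here is my code: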
import sqlite3

import cv2
import numpy as np
# compare_ssim was removed from skimage.measure in newer scikit-image releases;
# skimage.metrics.structural_similarity is its replacement
from skimage.metrics import structural_similarity

# screenshots - list that contains dict(id=screenshot id, image=jpeg image data)
# video_file - str - path to video file
# c - sqlite3 cursor, assumed to be created elsewhere
def generate_matches(screenshots, video_file):
    for screenshot in screenshots:
        if "image" in screenshot:  # decode only once; later calls reuse cv_img
            screenshot["cv_img"] = cv2.imdecode(
                np.frombuffer(screenshot["image"], np.uint8), cv2.IMREAD_GRAYSCALE)
            screenshot.pop("image", None)  # remove jpg data from RAM
        screenshot["best_match"] = dict(score=0, frame=0)

    vidcap = cv2.VideoCapture(video_file)
    success, image = vidcap.read()
    count = 1
    while success:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        for screenshot in screenshots:
            # resize the frame to the screenshot's (width, height)
            c_image = cv2.resize(image, screenshot["cv_img"].shape[1::-1])
            score = structural_similarity(screenshot["cv_img"], c_image, full=False)
            if score > screenshot["best_match"]["score"]:
                screenshot["best_match"] = dict(score=score, frame=count)
        count += 1
        success, image = vidcap.read()
        if count % 500 == 0:
            print("Frame {}".format(count))
    print("Last Frame {}".format(count))

    for screenshot in screenshots:
        c.execute("INSERT INTO matches(screenshot_id, file, match, frame) VALUES (?,?,?,?)",
                  (screenshot["id"], video_file, screenshot["best_match"]["score"], screenshot["best_match"]["frame"]))

generate_matches(list_of_screenshots, "video1.mp4")
generate_matches(list_of_screenshots, "video2.mp4")
...
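This algorithm seems good enough to associate the images with the videos, but it is slow, even when I use more threads. Is there a way to make it faster? Maybe a different algorithm, or some preprocessing of the videos and images? I would be glad for any ideas!
Answer 0 (score: 0)
Following sascha's suggestion, I computed dhashes (source) for all frames in the videos and for all screenshots, and compared them using the Hamming distance (source).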
def dhash(image, hashSize=16):  # hashSize=16 worked best for me
    # resize the input image, adding a single column (width) so we
    # can compute the horizontal gradient
    resized = cv2.resize(image, (hashSize + 1, hashSize))
    # compute the (relative) horizontal gradient between adjacent
    # column pixels
    diff = resized[:, 1:] > resized[:, :-1]
    # convert the difference image to a hash
    return sum([2 ** i for (i, v) in enumerate(diff.flatten()) if v])

def hamming(a, b):
    return bin(a ^ b).count('1')
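This solution is fast and precise enough for my needs. The results would most likely improve with a different hashing function (for example OpenCV's pHash), but I could not find one in the OpenCV Python bindings.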
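For completeness, here is a minimal sketch of how the comparison step could look once the hashes are computed. It assumes the frame hashes of each video and the screenshot hashes have already been produced (for example with dhash above) and are held as plain Python integers. The names best_frame and match_screenshots and the dict layouts are made up for this illustration, and ties are resolved in favour of the first video or frame seen.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def best_frame(screenshot_hash, frame_hashes):
    """Return (distance, frame_number) of the closest frame.

    Frames are numbered from 1, like the frame counter in the question.
    """
    return min(
        (hamming(screenshot_hash, h), n)
        for n, h in enumerate(frame_hashes, start=1)
    )


def match_screenshots(screenshot_hashes, video_hashes):
    """Map each screenshot id to (video, frame_number, distance).

    screenshot_hashes: dict of screenshot id -> integer hash (hypothetical layout)
    video_hashes: dict of video name -> list of integer frame hashes (hypothetical layout)
    """
    matches = {}
    for shot_id, shot_hash in screenshot_hashes.items():
        best = None
        for video, frame_hashes in video_hashes.items():
            if not frame_hashes:
                continue  # skip videos without any hashed frames
            distance, frame = best_frame(shot_hash, frame_hashes)
            # keep the video whose closest frame has the smallest distance
            if best is None or distance < best[2]:
                best = (video, frame, distance)
        matches[shot_id] = best
    return matches
```

Because each video is hashed only once, adding more screenshots costs only integer XORs and bit counts, which is where the speedup over per-frame SSIM comes from. A distance threshold could be added on top to reject screenshots that do not belong to any video; which threshold works would have to be found by experiment.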