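I am trying to reproduce the training process of dlib's frontal_face_detector(). I am using the same dataset (from http://dlib.net/files/data/dlib_face_detector_training_data.tar.gz) that dlib says was used: the union of the frontal and profile faces plus their mirror images.
My problems are:
1. Memory usage for the whole dataset is very high (30+ GB).
2. Training on part of the dataset does not give a very high recall: 50-60%, versus 80-90% for frontal_face_detector (tested on a subset of images not used for training).
3. The detectors work badly on low-resolution images and therefore fail to detect faces more than 1-1.5 meters away from the camera.
4. Training run time increases significantly with the SVM's C parameter, which I have to raise to get better recall (I suspect the gain is just an overfitting artifact).
My original motivation for training was:
a. To be able to adapt the detector to the specific environment where the camera is installed, e.g. through hard negative mining.
b. To improve detection at depth and the run time by reducing the 80x80 window to 64x64 or even 48x48.
Am I on the right path? Am I missing anything? Please help...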
Answer 0 (score: 3)
The training parameters that were used are recorded in a comment in dlib's code: http://dlib.net/dlib/image_processing/frontal_face_detector.h.html. For reference:
It is built out of 5 HOG filters. A front looking, left looking, right looking,
front looking but rotated left, and finally a front looking but rotated right one.
Moreover, here is the training log and parameters used to generate the filters:
The front detector:
trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
upsampled each image by 2:1
used pyramid_down<6>
loss per missed target: 1
epsilon: 0.05
padding: 0
detection window size: 80 80
C: 700
nuclear norm regularizer: 9
cell_size: 8
num filters: 78
num images: 4748
Train detector (precision,recall,AP): 0.999793 0.895517 0.895368
singular value threshold: 0.15
The left detector:
trained on labeled_faces_in_the_wild/left_faces.xml
upsampled each image by 2:1
used pyramid_down<6>
loss per missed target: 2
epsilon: 0.05
padding: 0
detection window size: 80 80
C: 250
nuclear norm regularizer: 8
cell_size: 8
num filters: 63
num images: 493
Train detector (precision,recall,AP): 0.991803 0.86019 0.859486
singular value threshold: 0.15
The right detector:
trained left-right flip of labeled_faces_in_the_wild/left_faces.xml
upsampled each image by 2:1
used pyramid_down<6>
loss per missed target: 2
epsilon: 0.05
padding: 0
detection window size: 80 80
C: 250
nuclear norm regularizer: 8
cell_size: 8
num filters: 66
num images: 493
Train detector (precision,recall,AP): 0.991781 0.85782 0.857341
singular value threshold: 0.19
The front-rotate-left detector:
trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
upsampled each image by 2:1
used pyramid_down<6>
rotated left 27 degrees
loss per missed target: 1
epsilon: 0.05
padding: 0
detection window size: 80 80
C: 700
nuclear norm regularizer: 9
cell_size: 8
num images: 4748
singular value threshold: 0.12
The front-rotate-right detector:
trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
upsampled each image by 2:1
used pyramid_down<6>
rotated right 27 degrees
loss per missed target: 1
epsilon: 0.05
padding: 0
detection window size: 80 80
C: 700
nuclear norm regularizer: 9
cell_size: 8
num filters: 89
num images: 4748
Train detector (precision,recall,AP): 1 0.897369 0.897369
singular value threshold: 0.15
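To relate the "detection window size: 80 80" and "used pyramid_down<6>" lines to the low-resolution problem in the question, here is a rough back-of-the-envelope sketch. It assumes that pyramid_down<6> shrinks the image by 5/6 per level and that each upsampling step at detection time doubles the image. The function name window_sizes_px and its parameters are made up for illustration and are not part of dlib. The numbers are box sizes in original-image pixels, so treat them as approximate lower bounds rather than exact limits.

```python
def window_sizes_px(window=80, pyramid_n=6, upsample_times=0, levels=4):
    """Approximate side length (in original-image pixels) that the fixed
    detection window covers at the first few pyramid levels.

    window         -- side of the square detection window (80 in the logs)
    pyramid_n      -- N in pyramid_down<N>; each level is (N-1)/N the size
    upsample_times -- how many times the input image is doubled before scanning
    levels         -- how many pyramid levels to list
    """
    if pyramid_n < 2:
        raise ValueError("pyramid_n must be at least 2")
    # Going one level down the pyramid makes the window cover N/(N-1) more pixels.
    growth = pyramid_n / (pyramid_n - 1)
    # Doubling the image halves the smallest box the window can match.
    base = window / (2 ** upsample_times)
    return [base * growth ** k for k in range(levels)]


if __name__ == "__main__":
    for up in (0, 1, 2):
        sizes = window_sizes_px(upsample_times=up)
        print("upsample x%d:" % up, [round(s, 1) for s in sizes])
```

Under these assumptions, an 80x80 window cannot match boxes smaller than about 80 pixels unless the image is upsampled first, and one upsampling step brings that down to about 40 pixels. Shrinking the window to 48x48 lowers the limit in the same way, but with cell_size 8 it leaves only about 6x6 HOG cells to describe a face instead of about 10x10.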
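What the parameters are and how to set them is all explained in the dlib documentation. There is also a paper that describes the training algorithm: Max-Margin Object Detection.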
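As a minimal sketch of how the front detector's logged values might be carried over to dlib's Python bindings, assuming the attribute names of dlib.simple_object_detector_training_options in a recent dlib release (check them against your installed version). The helper names apply_front_settings and train_front_detector are hypothetical. The logged "loss per missed target", "cell_size" and "singular value threshold" are left out because, as far as I know, they are only configurable from the C++ API.

```python
# Values copied from "The front detector" log above.
FRONT_LOG = {
    "C": 700,
    "epsilon": 0.05,
    "window": (80, 80),          # detection window size: 80 80
    "nuclear_norm": 9,           # nuclear norm regularizer: 9
    "mirrored": True,            # trained on mirrored set
}


def apply_front_settings(options, log=FRONT_LOG):
    """Copy the logged values onto a training-options object.

    Attribute names are assumed to match dlib's Python bindings.
    """
    options.C = log["C"]
    options.epsilon = log["epsilon"]
    # The Python option is assumed to take the window AREA in pixels.
    options.detection_window_size = log["window"][0] * log["window"][1]
    options.add_left_right_image_flips = log["mirrored"]
    # Assumed to exist only in newer dlib releases.
    options.nuclear_norm_regularization_strength = log["nuclear_norm"]
    return options


def train_front_detector(xml_path, out_path):
    """Hypothetical wrapper; xml_path is an imglab-style XML annotation file."""
    import dlib  # imported lazily so the helpers above work without dlib

    options = apply_front_settings(dlib.simple_object_detector_training_options())
    options.be_verbose = True
    dlib.train_simple_object_detector(xml_path, out_path, options)
```

This does not reproduce the original run exactly, since that run used the C++ trainer with settings the Python wrapper does not expose, but it is a reasonable starting point for comparing recall against frontal_face_detector.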
And yes, running the trainer can take a lot of RAM.