Question

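I am trying to reproduce the training process of dlib's frontal_face_detector(). I am using the same dataset (from http://dlib.net/files/data/dlib_face_detector_training_data.tar.gz) that dlib says it used: the union of the frontal and profile faces plus their reflections.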

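My problems are:
1. Memory usage for the whole dataset is very high (30+ GB).
2. Training on part of the dataset does not yield a high recall rate: 50-60 percent, versus 80-90 percent for frontal_face_detector (tested on a subset of images not used for training).
3. The detectors work badly on low-resolution images and therefore fail to detect faces that are more than 1-1.5 meters from the camera.
4. Training run time increases significantly with the SVM's C parameter, which I have to increase to get a better recall rate (I suspect this is just an overfitting artifact).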

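My original motivation for training was:
a. to gain the ability to adapt to the specific environment where the camera is installed, e.g. through hard negative mining;
b. to improve detection at depth and run time by reducing the 80x80 window to 64x64 or even 48x48.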

Am I on the right path? Am I missing anything? Please help...

Answer 1

The training parameters used are recorded in a comment in dlib's code at http://dlib.net/dlib/image_processing/frontal_face_detector.h.html. For reference:

    It is built out of 5 HOG filters. A front looking, left looking, right looking, 
    front looking but rotated left, and finally a front looking but rotated right one.

    Moreover, here is the training log and parameters used to generate the filters:
    The front detector:
        trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
        upsampled each image by 2:1
        used pyramid_down<6> 
        loss per missed target: 1
        epsilon: 0.05
        padding: 0
        detection window size: 80 80
        C: 700
        nuclear norm regularizer: 9
        cell_size: 8
        num filters: 78
        num images: 4748
        Train detector (precision,recall,AP): 0.999793 0.895517 0.895368 
        singular value threshold: 0.15

    The left detector:
        trained on labeled_faces_in_the_wild/left_faces.xml
        upsampled each image by 2:1
        used pyramid_down<6> 
        loss per missed target: 2
        epsilon: 0.05
        padding: 0
        detection window size: 80 80
        C: 250
        nuclear norm regularizer: 8
        cell_size: 8
        num filters: 63
        num images: 493
        Train detector (precision,recall,AP): 0.991803  0.86019 0.859486 
        singular value threshold: 0.15

    The right detector:
        trained left-right flip of labeled_faces_in_the_wild/left_faces.xml
        upsampled each image by 2:1
        used pyramid_down<6> 
        loss per missed target: 2
        epsilon: 0.05
        padding: 0
        detection window size: 80 80
        C: 250
        nuclear norm regularizer: 8
        cell_size: 8
        num filters: 66
        num images: 493
        Train detector (precision,recall,AP): 0.991781  0.85782 0.857341 
        singular value threshold: 0.19

    The front-rotate-left detector:
        trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
        upsampled each image by 2:1
        used pyramid_down<6> 
        rotated left 27 degrees
        loss per missed target: 1
        epsilon: 0.05
        padding: 0
        detection window size: 80 80
        C: 700
        nuclear norm regularizer: 9
        cell_size: 8
        num images: 4748
        singular value threshold: 0.12

    The front-rotate-right detector:
        trained on mirrored set of labeled_faces_in_the_wild/frontal_faces.xml
        upsampled each image by 2:1
        used pyramid_down<6> 
        rotated right 27 degrees
        loss per missed target: 1
        epsilon: 0.05
        padding: 0
        detection window size: 80 80
        C: 700
        nuclear norm regularizer: 9
        cell_size: 8
        num filters: 89
        num images: 4748
        Train detector (precision,recall,AP):        1 0.897369 0.897369 
        singular value threshold: 0.15

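What the parameters are and how to set them is all explained in the dlib documentation. There is also a paper that describes the training algorithm: Max-Margin Object Detection.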
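As a rough illustration of how the logged values could be carried into a training setup, here is a minimal Python sketch. It assumes dlib's Python `simple_object_detector_training_options` as the target, which is not what produced the log above (those filters came from dlib's C++ tooling), so only `C`, `epsilon`, and the window size are mapped; the nuclear norm regularizer, loss per missed target, and singular value threshold are left out. `FRONT_LOG`, `apply_log_to_options`, and `smallest_face_side` are hypothetical names made up for this example. The second helper shows the arithmetic behind the low-resolution problem: with an 80x80 window and 2:1 upsampling, faces much smaller than about 40 pixels on a side in the original image fall below the window size.

```python
from types import SimpleNamespace

# Values copied from the "front detector" entry in the log above.
FRONT_LOG = {
    "C": 700,
    "epsilon": 0.05,
    "window_side": 80,  # "detection window size: 80 80"
    "upsample": 2,      # "upsampled each image by 2:1"
}


def apply_log_to_options(options, log):
    """Copy logged values onto an options object and return it.

    The attribute names match dlib's Python
    simple_object_detector_training_options (an assumption: the log
    itself was produced by dlib's C++ training code).
    """
    options.C = log["C"]
    options.epsilon = log["epsilon"]
    # The Python API expresses the window size as an area in pixels.
    options.detection_window_size = log["window_side"] ** 2
    return options


def smallest_face_side(window_side, upsample):
    """Approximate side length, in original-image pixels, of the smallest
    face that fills a window of `window_side` pixels after upsampling."""
    return window_side / upsample


if __name__ == "__main__":
    # SimpleNamespace stands in for the dlib options object so that the
    # sketch runs without dlib installed.
    opts = apply_log_to_options(SimpleNamespace(), FRONT_LOG)
    print(opts.C, opts.epsilon, opts.detection_window_size)
    for side in (80, 64, 48):
        print(side, smallest_face_side(side, FRONT_LOG["upsample"]))
```

With dlib installed, `dlib.simple_object_detector_training_options()` can be passed in place of the `SimpleNamespace()` stand-in, and the result handed to `dlib.train_simple_object_detector` together with the dataset XML file and an output file name.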

Yes, it can take a lot of RAM to run the trainer.
