Training Yolov3-tiny on Google Colab, but it stopped after 4000 iterations. How do I continue training?

Asked: 2020-07-09 00:04:36

Tags: python windows computer-vision yolo

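(I'm a beginner.) I trained the model with yolov3-tiny.cfg and darknet53.conv.74, because I couldn't load yolov3-tiny.weights (not sure whether that matters). The model trained in Colab for 3000 iterations (a few hours) before it stopped. With those weights the model performs poorly (I know tiny YOLO isn't very accurate, but this is extremely inaccurate). I'm fairly sure the cause is too few iterations, so I tried to load the last saved training weights from Drive and continue training with this command: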

!./darknet detector train data/obj.data cfg/yolov3-tiny_training.cfg /mydrive/yolov3/yolov3-tiny_training_last.weights -dont_show

When I run it, I get this output, and training ends right away:

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
yolov3-tiny_training
 0 : compute_capability = 370, cudnn_half = 0, GPU: Tesla K80 
net.optimized_memory = 0 
mini_batch = 4, batch = 64, time_steps = 1, train = 1 
   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  16 0.150 BF
   1 max                2x 2/ 2    416 x 416 x  16 ->  208 x 208 x  16 0.003 BF
   2 conv     32       3 x 3/ 1    208 x 208 x  16 ->  208 x 208 x  32 0.399 BF
   3 max                2x 2/ 2    208 x 208 x  32 ->  104 x 104 x  32 0.001 BF
   4 conv     64       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  64 0.399 BF
   5 max                2x 2/ 2    104 x 104 x  64 ->   52 x  52 x  64 0.001 BF
   6 conv    128       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x 128 0.399 BF
   7 max                2x 2/ 2     52 x  52 x 128 ->   26 x  26 x 128 0.000 BF
   8 conv    256       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 256 0.399 BF
   9 max                2x 2/ 2     26 x  26 x 256 ->   13 x  13 x 256 0.000 BF
  10 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  11 max                2x 2/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.000 BF
  12 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256 0.089 BF
  14 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  15 conv     21       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x  21 0.004 BF
  16 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  17 route  13                                 ->   13 x  13 x 256 
  18 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  19 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  20 route  19 8                               ->   26 x  26 x 384 
  21 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  22 conv     21       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x  21 0.007 BF
  23 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 5.449 
avg_outputs = 325057 
 Allocate additional workspace_size = 12.46 MB 
Loading weights from /mydrive/yolov3/yolov3-tiny_training_last.weights...
 seen 64, trained: 256 K-images (4 Kilo-batches_64) 
Done! Loaded 24 layers from weights-file 
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 Detection layer: 16 - type = 27 
 Detection layer: 23 - type = 27 
Saving weights to /mydrive/yolov3/yolov3-tiny_training_final.weights
 Create 6 permanent cpu-threads 

Does anyone know how to load the last weights so that I can continue training?

1 answer:

Answer 0 (score: 0):

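To solve this, I added -clear 1 to the end of the train command. As described in this post, that clears the statistics about the images the model was previously trained on, so training starts counting iterations from zero again instead of treating the run as already finished.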
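
As a rough illustration of why the run ended immediately, the sketch below redoes the arithmetic from the log in the question. The log reports "trained: 256 K-images" with batch = 64, which works out to 4000 iterations already recorded in the weights file. The value of max_batches is not shown anywhere in the question, so the 4000 used here is an assumption, and the helper names iterations_done and resume_command are made up for this example. It also assumes, as I understand darknet's behavior, that the iteration counter is derived from the number of images seen divided by the batch size.

```python
# Values read from the log in the question.
IMAGES_SEEN = 256_000  # "trained: 256 K-images"
BATCH = 64             # "batch = 64"

# ASSUMPTION: max_batches is not shown in the question; 4000 is a guess
# based on the title saying training stopped after 4000 iterations.
MAX_BATCHES = 4000


def iterations_done(images_seen: int, batch: int) -> int:
    """Iterations implied by the image counter (one iteration per batch)."""
    return images_seen // batch


def resume_command(weights: str, clear: bool) -> str:
    """Build the train command from the question (hypothetical helper).

    When clear is True, append "-clear 1" as the answer suggests.
    """
    parts = [
        "./darknet", "detector", "train",
        "data/obj.data",
        "cfg/yolov3-tiny_training.cfg",
        weights,
        "-dont_show",
    ]
    if clear:
        parts += ["-clear", "1"]
    return " ".join(parts)


done = iterations_done(IMAGES_SEEN, BATCH)
# If the saved counter already reached the limit, there is nothing left to train.
already_finished = done >= MAX_BATCHES

print(f"iterations recorded in weights: {done}")
print(f"already at max_batches: {already_finished}")
print(resume_command(
    "/mydrive/yolov3/yolov3-tiny_training_last.weights",
    clear=already_finished,
))
```

In a Colab cell the printed command would be run with a leading "!". Note that resetting the counter presumably also restarts anything in the cfg that depends on the iteration number, such as the learning-rate schedule. Raising max_batches in the cfg file is an alternative that keeps the saved counter.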