Increasing --train_batch_size 2 to --train_batch_size 3 causes Mozilla DeepSpeech to stop training. Why?

Time: 2018-07-01 18:58:50

Tags: speech-recognition mozilla-deepspeech

Increasing --train_batch_size from 2 to 3 causes Mozilla DeepSpeech to stop training. What could explain this?


Specifically, if I run:

./DeepSpeech.py --train_files data/common-voice-v1/cv-valid-train.csv --dev_files \
 data/common-voice-v1/cv-valid-dev.csv \
--test_files data/common-voice-v1/cv-valid-test.csv  \
 --log_level 0 --limit_train 10000 --train_batch_size 2 --train True

I get set_name: train:

D Starting queue runners...
D Queue runners started.
I STARTING Optimization
D step: 77263
D epoch: 61
D target epoch: 75
D steps per epoch: 1250
D number of batches in train set: 5000
D batches per job: 4
D batches per step: 4
D number of jobs in train set: 1250
D number of jobs already trained in first epoch: 1013
D Computing Job (ID: 2, worker: 0, epoch: 61, set_name: train)...
D Starting batch...
D Finished batch step 77264.
D Sending Job (ID: 2, worker: 0, epoch: 61, set_name: train)...
D Computing Job (ID: 3, worker: 0, epoch: 61, set_name: train)...
D Starting batch...
D Finished batch step 77265.
D Sending Job (ID: 3, worker: 0, epoch: 61, set_name: train)...
D Computing Job (ID: 4, worker: 0, epoch: 61, set_name: train)...
D Starting batch...
D Finished batch step 77266.
D Sending Job (ID: 4, worker: 0, epoch: 61, set_name: train)...
[...]

However, if I run:

./DeepSpeech.py --train_files data/common-voice-v1/cv-valid-train.csv --dev_files \
 data/common-voice-v1/cv-valid-dev.csv \
--test_files data/common-voice-v1/cv-valid-test.csv  \
 --log_level 0 --limit_train 10000 --train_batch_size 3 --train True

I get set_name: test:

D Starting queue runners...
D Queue runners started.
D step: 77263
D epoch: 92
D target epoch: 75
D steps per epoch: 833
D number of batches in train set: 3334
D batches per job: 4
D batches per step: 4
D number of jobs in train set: 833
D number of jobs already trained in first epoch: 627
D Computing Job (ID: 2, worker: 0, epoch: 92, set_name: test)...
D Starting batch...
D Finished batch step 77263.
D Sending Job (ID: 2, worker: 0, epoch: 92, set_name: test)...
D Computing Job (ID: 3, worker: 0, epoch: 92, set_name: test)...
D Starting batch...
D Finished batch step 77263.
D Sending Job (ID: 3, worker: 0, epoch: 92, set_name: test)...
D Computing Job (ID: 4, worker: 0, epoch: 92, set_name: test)...
D Starting batch...
D Finished batch step 77263.
D Sending Job (ID: 4, worker: 0, epoch: 92, set_name: test)...
D Computing Job (ID: 5, worker: 0, epoch: 92, set_name: test)...
D Starting batch...
D Finished batch step 77263.
D Sending Job (ID: 5, worker: 0, epoch: 92, set_name: test)...
[...]

I am training Mozilla DeepSpeech on 4 Nvidia GeForce GTX 1080 GPUs.

1 answer:

Answer 0: (score: 0)

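As lissyx pointed out, the problem was that the checkpoint directory had not been cleaned. This is visible in the logs in the question, e.g. D Finished batch step 77263., whereas the batch step would have started from 0 if the checkpoint directory had been cleaned. The restored step count of 77263 corresponds to epoch 61 at 1250 steps per epoch (--train_batch_size 2), which is below the target epoch of 75, so training continues. At 833 steps per epoch (--train_batch_size 3) the same step count corresponds to epoch 92, which is already past the target epoch. As a result, when run with --train_batch_size > 2, it skipped straight to the test phase.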
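The epoch numbers in the two logs can be reproduced with a few lines of arithmetic. This is a minimal sketch, not DeepSpeech's actual code: the function name resume_state is hypothetical, and the rounding rules (batches rounded up, steps per epoch rounded down, training continues only while the epoch is below the target) are assumptions inferred from the logged values in the question.

```python
import math

def resume_state(global_step, num_samples, batch_size, num_gpus, target_epoch):
    """Estimate where training resumes from a restored checkpoint step.

    Assumed rules (inferred from the logs, not from the DeepSpeech source):
    batches are rounded up, steps per epoch are rounded down.
    """
    batches = math.ceil(num_samples / batch_size)
    steps_per_epoch = max(1, batches // num_gpus)  # one batch per GPU per step
    epoch = global_step // steps_per_epoch
    jobs_done = global_step % steps_per_epoch
    return {
        "batches": batches,
        "steps_per_epoch": steps_per_epoch,
        "epoch": epoch,
        "jobs_done_in_epoch": jobs_done,
        "will_train": epoch < target_epoch,  # assumed comparison
    }

# Values taken from the question: step 77263, --limit_train 10000, 4 GPUs, target epoch 75.
for batch_size in (2, 3):
    print(batch_size, resume_state(77263, 10000, batch_size, 4, 75))
```

With the values from the question, batch size 2 gives epoch 61 with 1013 jobs already done, and batch size 3 gives epoch 92 with 627 jobs already done, matching the two logs.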

On Ubuntu, the default location of the checkpoint directory is /home/[username]/.local/share/deepspeech/checkpoints. The location can be changed with the --checkpoint_dir argument.
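One way to start from a clean state without losing the old checkpoints is to move the directory aside before training. The helper below is a hypothetical sketch (archive_checkpoints is not part of DeepSpeech). It only renames the directory, on the assumption that DeepSpeech then finds nothing to restore and starts from step 0.

```python
import os
import shutil
import time

def archive_checkpoints(checkpoint_dir):
    """Move an existing checkpoint directory aside.

    Returns the backup path, or None if there was nothing to move.
    """
    checkpoint_dir = os.path.expanduser(checkpoint_dir).rstrip(os.sep)
    if not os.path.isdir(checkpoint_dir):
        return None
    # Timestamped suffix so repeated runs do not overwrite earlier backups.
    backup = "{}.bak-{}".format(checkpoint_dir, time.strftime("%Y%m%d-%H%M%S"))
    if os.path.exists(backup):
        raise FileExistsError(backup)
    shutil.move(checkpoint_dir, backup)
    return backup
```

Calling archive_checkpoints("~/.local/share/deepspeech/checkpoints") before launching ./DeepSpeech.py would leave the default location empty. Alternatively, passing --checkpoint_dir with a fresh, empty directory avoids touching the old checkpoints at all.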