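Training on the GPU is much slower than on the CPU - why, and how can I speed it up?

Time: 2020-06-27 16:19:36

Tags: python gpu conv-neural-network google-colaboratory

I am training a convolutional neural network on Google Colab, once on the CPU and once on the GPU.

This is the architecture of the network: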

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 62, 126, 32)       896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 31, 63, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 29, 61, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 30, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 12, 28, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 14, 64)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 12, 64)         36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 6, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 768)               0         
_________________________________________________________________
dropout (Dropout)            (None, 768)               0         
_________________________________________________________________
lambda (Lambda)              (None, 1, 768)            0         
_________________________________________________________________
dense (Dense)                (None, 1, 256)            196864    
_________________________________________________________________
dense_1 (Dense)              (None, 1, 8)              2056      
_________________________________________________________________
permute (Permute)            (None, 8, 1)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 8, 36)             72        
=================================================================
Total params: 264,560
Trainable params: 264,560
Non-trainable params: 0

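So it is a very small network, but with an unusual output of shape (8, 36), because I want to recognize the characters on license plate images.

I use the following code to train the network: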

model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    steps_per_epoch = num_train_samples // 128,
                    validation_steps = num_val_samples // 128,
                    epochs = 10)

The generator resizes the images to (64, 128). This is the code of the generator:

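import math

import numpy as np
# Assumption: imread and resize come from scikit-image (the question does not show its imports).
from skimage.io import imread
from skimage.transform import resize
from tensorflow.keras.utils import Sequence


class DataGenerator(Sequence):

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = (idx + 1) * self.batch_size
        batch_x = self.x[start:end]
        batch_y = self.y[start:end]

        # Read every image file of the batch from disk and resize it to (64, 128).
        return np.array([
            resize(imread(file_name), (64, 128))
            for file_name in batch_x]), np.array(batch_y)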
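A side note on the step counts, as a small sketch: `__len__` rounds up, while the `fit_generator` call rounds down with `// 128`. Assuming a batch size of 128 and exactly 105,000 training images (both values are illustrative; the real numbers may differ), the two counts differ by one step:

```python
import math

num_train_samples = 105_000  # assumption: illustrative round number
batch_size = 128             # assumption: same value as the 128 in the fit call

steps_floor = num_train_samples // batch_size            # what steps_per_epoch uses
steps_ceil = math.ceil(num_train_samples / batch_size)   # what __len__ returns
leftover = num_train_samples - steps_floor * batch_size  # images in the partial last batch

print(steps_floor, steps_ceil, leftover)  # 820 821 40
```

Under these assumptions the mismatch would most likely only mean that the final, smaller batch is skipped in each epoch, so it is unlikely to explain the slowdown.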

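On the CPU, one epoch takes 70-90 minutes. On the GPU (149 W), it takes 5 times as long as on the CPU.

  1. Do you know why it takes so long? Is there something wrong with the generator?
  2. Can I speed this process up somehow?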
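One possible way to check whether the generator is the bottleneck is to time it on its own, without the model. The following is a minimal sketch, not a definitive benchmark: `seconds_per_batch` and `SlowSequence` are hypothetical names, and `SlowSequence` only stands in for the real `DataGenerator` so that the snippet runs anywhere.

```python
import time


def seconds_per_batch(sequence, num_batches=5):
    """Average wall-clock seconds needed to load one batch from `sequence`."""
    num_batches = min(num_batches, len(sequence))
    start = time.perf_counter()
    for idx in range(num_batches):
        sequence[idx]  # calls __getitem__, i.e. file reads and resizing
    return (time.perf_counter() - start) / num_batches


class SlowSequence:
    """Hypothetical stand-in for DataGenerator: each batch takes about 10 ms."""

    def __len__(self):
        return 3

    def __getitem__(self, idx):
        time.sleep(0.01)  # simulates slow disk access
        return idx


avg = seconds_per_batch(SlowSequence())
print(f"{avg:.3f} s per batch")
```

Passing the real `training_generator` instead of `SlowSequence()` would show how long loading a batch takes. If that time is close to the duration of a whole training step, the accelerator is probably waiting for data most of the time.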

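Edit: Here is the link to my notebook: https://colab.research.google.com/drive/1ux9E8DhxPxtgaV60WUiYI2ew2s74Xrwh?usp=sharing

My data is stored in my Google Drive. The training dataset contains 105k images and the validation dataset 76k. In total, I have 1.8 GB of data.

Should I store the data somewhere else?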
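One option that could be tried is to copy the dataset once from the mounted Drive folder to the local disk of the Colab machine and point the generator at the copy. The sketch below assumes Python 3.8+ (for `dirs_exist_ok`); `copy_dataset_to_local` is a hypothetical helper, and the temporary folders only stand in for the real paths so that the snippet runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dataset_to_local(drive_dir, local_dir):
    """Copy a dataset folder and return the number of files now in local_dir."""
    src, dst = Path(drive_dir), Path(local_dir)
    shutil.copytree(src, dst, dirs_exist_ok=True)  # requires Python 3.8+
    return sum(1 for path in dst.rglob("*") if path.is_file())


# Demonstration with temporary folders. In Colab, the arguments would be
# something like "/content/drive/MyDrive/plates" and "/content/plates"
# (both paths are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    drive_dir = Path(tmp) / "drive"
    drive_dir.mkdir()
    (drive_dir / "plate_0001.png").write_bytes(b"")  # empty placeholder file
    copied = copy_dataset_to_local(drive_dir, Path(tmp) / "local")

print(copied, "file(s) copied")
```

Whether this helps depends on how slow file access through the Drive mount actually is, which the timing of the generator would reveal.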

Thank you very much!

1 answer:

Answer 0: (score: 1)

I think you have not enabled the GPU.


Go to Edit and select Notebook Settings. Then click GPU.
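To confirm that the setting took effect, you can ask TensorFlow which GPUs it sees. This is a minimal sketch assuming TensorFlow 2.1 or newer, where `tf.config.list_physical_devices` is available; `visible_gpus` is a hypothetical helper name.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []  # TensorFlow is not installed in this environment
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print(gpus if gpus else "No GPU visible: check the hardware accelerator setting")
```

An empty list after switching the notebook to GPU would suggest that the runtime was not restarted or that no GPU was assigned.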