TensorFlow GPU utilization problem

Asked: 2019-04-29 00:09:19

Tags: python tensorflow keras gpu tf.keras

I am trying to train a network (an LRCN, i.e. a CNN followed by an LSTM) using TensorFlow:

model = Sequential()
# ...
# my model layers

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    use_multiprocessing=True,
                    workers=6)

I am creating the generator class by following this link. It looks like this:

import h5py
import numpy as np
import tensorflow as tf


class DataGenerator(tf.keras.utils.Sequence):
    # 'Generates data for Keras'
    def __init__(self, list_ids, labels, batch_size=8, dim=(15, 16, 3200), n_channels=1,
                 n_classes=3, shuffle=True):
        # 'Initialization'
        self.dim = dim
        self.batch_size = batch_size
        self.labels = labels
        self.list_IDs = list_ids
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        # 'Denotes the number of batches per epoch'
        return int(np.floor(len(self.list_IDs) / self.batch_size))

    def __getitem__(self, index):
        # 'Generate one batch of data'
        # Generate indexes of the batch
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]

        # Find list of IDs
        list_ids_temp = [self.list_IDs[k] for k in indexes]

        # Generate data
        X, y = self.__data_generation(list_ids_temp)

        return X, y

    def on_epoch_end(self):
        # 'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.list_IDs))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_ids_temp):
        # 'Generates data containing batch_size samples'  # X : (n_samples, *dim, n_channels)
        # Initialization
        X = np.empty((self.batch_size, *self.dim, self.n_channels))
        y = np.empty(self.batch_size, dtype=int)

        sequences = np.empty((15, 16, 3200, self.n_channels))

        # Generate data
        for i, ID in enumerate(list_ids_temp):
            with h5py.File(ID, 'r') as file:
                _data = list(file['decimated_data'])

            _npData = np.array(_data)
            _allSequences = np.transpose(_npData)

            # a 16 x 48000 matrix is split into 15 sequences of size 16x3200
            # (slice by sq, not i, so each pass takes the sq-th 3200-column chunk)
            for sq in range(15):
                sequences[sq, :, :, :] = np.reshape(
                    _allSequences[0:16, sq * 3200:(sq + 1) * 3200], (16, 3200, 1))
            # Store sample
            X[i] = sequences

            # Store class
            y[i] = self.labels[ID]

        return X, tf.keras.utils.to_categorical(y, num_classes=self.n_classes)
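For reference, the splitting step inside `__data_generation` can be sketched in isolation with NumPy, using a random array as a stand-in for the transposed HDF5 contents:

```python
import numpy as np

# Stand-in for the transposed 16 x 48000 matrix read from the HDF5 file
all_sequences = np.random.random((16, 48000))

# Split into 15 non-overlapping sequences of shape (16, 3200, 1)
sequences = np.empty((15, 16, 3200, 1))
for sq in range(15):
    chunk = all_sequences[0:16, sq * 3200:(sq + 1) * 3200]
    sequences[sq, :, :, :] = np.reshape(chunk, (16, 3200, 1))

print(sequences.shape)  # (15, 16, 3200, 1)
```

Since 15 × 3200 = 48000, the 15 chunks exactly tile the 48000 columns with no overlap.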

This works and the code runs, but I have noticed that GPU utilization stays at 0. When I set log_device_placement to true, it shows operations being assigned to the GPU. However, when I monitor the GPU with Task Manager or nvidia-smi, I see no activity.
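One way to narrow this down is to time the input pipeline on its own: if building a batch on the CPU takes far longer than the GPU needs to consume it, the GPU will sit mostly idle waiting for data. A rough, hypothetical sketch (with a random array standing in for the h5py read):

```python
import time
import numpy as np

def build_batch(batch_size=8, dim=(15, 16, 3200), n_channels=1):
    # Mimics __data_generation, but with random data instead of HDF5 reads
    X = np.empty((batch_size, *dim, n_channels))
    for i in range(batch_size):
        all_sequences = np.random.random((16, 48000))
        for sq in range(dim[0]):
            X[i, sq] = all_sequences[:, sq * 3200:(sq + 1) * 3200].reshape(16, 3200, 1)
    return X

start = time.perf_counter()
X = build_batch()
print(f"one batch: {time.perf_counter() - start:.3f} s, shape {X.shape}")
```

If the real generator (with the h5py reads included) takes seconds per batch, the near-zero GPU utilization may simply reflect an input-bound pipeline rather than a misconfigured GPU.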

However, when I skip the DataGenerator class and instead call model.fit() on data generated as shown below, I notice that the program does use the GPU:

data = np.random.random((550, num_seq, rows, cols, ch))
labels = np.random.random((550, 1))

_data['train'] = data[0:500, :]
_labels['train'] = labels[0:500, :]

_data['valid'] = data[500:, :]
_labels['valid'] = labels[500:, :]

model.fit(_data['train'],
          _labels['train'],
          epochs=FLAGS.epochs,
          batch_size=FLAGS.batch_size,
          validation_data=(_data['valid'], _labels['valid']),
          shuffle=True,
          callbacks=[tb, early_stopper, checkpoint])

So I am guessing the problem is not an incorrectly installed NVIDIA driver or a broken TensorFlow installation. This is the message I get when running either piece of code, and it shows that TF recognizes my GPU, which leads me to believe something is wrong with the DataGenerator class and/or fit_generator().

Can someone point out what I am doing wrong?

I am using TensorFlow 1.10 and CUDA 9 on a Windows 10 machine with a GTX 1050 Ti.

0 Answers:

No answers yet.