How to read GPU usage while running TensorFlow?

Time: 2019-01-22 21:13:31

Tags: python tensorflow gpu

I have two Anaconda environments, one for TensorFlow_CPU and one for TensorFlow_GPU. I want to measure the speed difference between the CPU and GPU versions, so in two terminals I am running the same program:

python grid.py

The script trains an MLP neural network. I want to see how much faster the GPU run is than the CPU run. While both were running, I printed the following:

[martin@A08-R32-I196-3-FZ2LTP2 ~]$ nvidia-smi
Wed Jan 23 04:52:56 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           Off  | 00000000:02:00.0 Off |                    0 |
| N/A   29C    P0    49W / 250W |  21817MiB / 22919MiB |     14%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P40           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   33C    P0    49W / 250W |    231MiB / 22919MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P40           Off  | 00000000:83:00.0 Off |                    0 |
| N/A   27C    P0    48W / 250W |    231MiB / 22919MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P40           Off  | 00000000:84:00.0 Off |                    0 |
| N/A   35C    P0    51W / 250W |    231MiB / 22919MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    186812      C   python                                     21807MiB |
|    1    186812      C   python                                       221MiB |
|    2    186812      C   python                                       221MiB |
|    3    186812      C   python                                       221MiB |
+-----------------------------------------------------------------------------+
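
For reference, a continuously updating view is also possible instead of a one-off snapshot: running "watch -n 1 nvidia-smi" in a shell, or polling nvidia-smi from Python. A minimal sketch, assuming nvidia-smi is on the PATH (the one-second interval is arbitrary):

import subprocess
import time

# print per-GPU index, utilization and memory once per second; stop with Ctrl-C
while True:
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used",
         "--format=csv,noheader"])
    print(out.decode().strip())
    time.sleep(1)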

Does this mean that all 4 GPUs are in use, with GPU '0' heavily used and the others only lightly used? Compared to the small 2 GB GPU in my Ubuntu desktop, this powerful 4-GPU machine does not seem to run this training task any faster.
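
For what it's worth, the output above matches TensorFlow's default behavior: it creates a CUDA context on every visible GPU (the ~231 MiB on GPUs 1-3) and reserves nearly all memory on GPU 0 (~21.8 GiB), but unless the code explicitly places ops on several devices, computation runs only on GPU:0, hence 14% utilization there and 0% elsewhere. A minimal sketch of pinning the process to one GPU and disabling the up-front memory grab, assuming the TF 1.x API that matches the CUDA 10.0 setup shown above:

import os

# make only the first physical GPU visible; must be set before TensorFlow is imported
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf
from keras import backend as K

# allocate GPU memory on demand instead of reserving almost the whole card (TF 1.x)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))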

Edit: the full code is as follows:

from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

import time

import numpy as np

start_time = time.time()

# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

    return model

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# load the Pima Indians diabetes dataset: 8 numeric features plus a binary label
data = np.loadtxt("/home/abigail/nlp/MLMastery/DLwithPython/code/chapter_07/pima-indians-diabetes.csv", delimiter=",")
X = data[:, 0:8]  # input features
Y = data[:, 8]    # target label

model = KerasClassifier(build_fn=create_model, verbose=1)

optimizers = ['rmsprop', 'adam']
inits = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]

# 2 optimizers x 3 initializers x 3 epoch counts x 3 batch sizes = 54 candidates
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=inits)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']

for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with %r" % (mean, stdev, param))



# report total elapsed wall-clock time as HH:MM:SS
elapsed = int(time.time() - start_time)
print('{:02d}:{:02d}:{:02d}'.format(elapsed // 3600, (elapsed % 3600 // 60), elapsed % 60))
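
To confirm where the ops actually run, the devices TensorFlow sees can be listed and device-placement logging enabled; a sketch against the TF 1.x API (a separate check, not part of the script above):

import tensorflow as tf
from tensorflow.python.client import device_lib

# list every device TensorFlow can use (CPU plus any visible GPUs)
print(device_lib.list_local_devices())

# log the device each op is assigned to when the session runs
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Independently of placement, a three-layer MLP on 8 input features with batch sizes of 5-20 is far too small to keep a Tesla P40 busy; per-batch overhead dominates, which would explain why this machine is no faster here than a small 2 GB desktop GPU.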

0 Answers:

No answers