I have two anaconda environments, one for TensorFlow_CPU and one for TensorFlow_GPU. I want to test the speed difference between the CPU and GPU versions, so in two terminals I am running the same program:
python grid.py
This trains an MLP neural network. I want to see how much faster the GPU run is than the CPU run. While both were running, I printed the following:
[martin@A08-R32-I196-3-FZ2LTP2 ~]$ nvidia-smi
Wed Jan 23 04:52:56 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P40 Off | 00000000:02:00.0 Off | 0 |
| N/A 29C P0 49W / 250W | 21817MiB / 22919MiB | 14% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P40 Off | 00000000:04:00.0 Off | 0 |
| N/A 33C P0 49W / 250W | 231MiB / 22919MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla P40 Off | 00000000:83:00.0 Off | 0 |
| N/A 27C P0 48W / 250W | 231MiB / 22919MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla P40 Off | 00000000:84:00.0 Off | 0 |
| N/A 35C P0 51W / 250W | 231MiB / 22919MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 186812 C python 21807MiB |
| 1 186812 C python 221MiB |
| 2 186812 C python 221MiB |
| 3 186812 C python 221MiB |
+-----------------------------------------------------------------------------+
Does this mean all four GPUs are in use, but GPU 0 is heavily used while the others are only lightly used? Compared with the small 2 GB GPU in my Ubuntu desktop, this powerful 4-GPU machine does not seem to speed up this training task at all.
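For reference: TensorFlow 1.x creates a CUDA context on every visible GPU (the ~221 MiB entries on GPUs 1-3) but, without explicit device placement, runs a single Keras model on GPU 0 only, and by default it also reserves nearly all of that GPU's memory up front, so the 21807 MiB figure reflects the allocator, not the model size. A minimal sketch, assuming TF 1.x with standalone Keras as in the code below, to pin the run to one GPU and allocate memory on demand:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # must be set before TensorFlow is imported

import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True     # grow the allocation as needed instead of reserving ~all memory
K.set_session(tf.Session(config=config))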
EDIT: the full code is below:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
import time
import numpy as np
start_time = time.time()
# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
data = np.loadtxt("/home/abigail/nlp/MLMastery/DLwithPython/code/chapter_07/pima-indians-diabetes.csv", delimiter=",")
X = data[:, 0:8]
Y = data[:, 8]
model = KerasClassifier(build_fn=create_model, verbose=1)
optimizers = ['rmsprop', 'adam']
inits = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=inits)
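# 2 optimizers x 3 inits x 3 epoch counts x 3 batch sizes = 54 candidate settings, each cross-validated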
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with %r" % (mean, stdev, param))
elapsed = int(time.time() - start_time)
print('{:02d}:{:02d}:{:02d}'.format(elapsed // 3600, (elapsed % 3600 // 60), elapsed % 60))
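To confirm which devices each environment actually sees, a quick check like this (assuming TF 1.x) can be run in both the CPU and GPU environments before timing anything:

# List the devices TensorFlow can see; the CPU-only env should show no GPU entries.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

# Optionally log where each op is placed when a session is created.
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))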