TPU accuracy seems lower than GPU accuracy

Time: 2019-12-11 02:08:21

Tags: python tensorflow machine-learning neural-network tpu

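I am trying to train a simple CNN on a TPU. However, it does not seem to reach the same accuracy as it does on a GPU. I wonder whether I am missing something. Please advise.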

Dataset: MNIST. Framework: tf.keras (TensorFlow 2.0). Details follow:

Neural network:

  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Activation

  model = Sequential()
  # First convolution block: three 3x3 convolutions, then 2x2 max pooling
  model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1), use_bias=False))
  model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', use_bias=False))
  model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', use_bias=False))
  model.add(MaxPool2D())
  # Second convolution block: 1x1 channel reduction, then two 3x3 convolutions
  model.add(Conv2D(filters=32, kernel_size=(1, 1), activation='relu', use_bias=False))
  model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', use_bias=False))
  model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', use_bias=False))
  # Classification head: reduce to 10 channels, then a 7x7 convolution yields one logit per class
  model.add(Conv2D(filters=10, kernel_size=(1, 1), activation='relu', use_bias=False))
  model.add(Conv2D(filters=10, kernel_size=(7, 7), use_bias=False))
  model.add(Flatten())
  model.add(Activation('softmax'))
  model.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])
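
As a sanity check on the architecture, here is a minimal sketch that traces the spatial size through the layers without needing TensorFlow. It assumes the Keras defaults that the model code relies on implicitly: `padding='valid'` and stride 1 for `Conv2D`, and a 2x2 pool for `MaxPool2D()`. The helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after max pooling where stride equals the pool size."""
    return size // pool


# (layer kind, kernel or pool size), in the order used by the model above
layers = [
    ("conv", 3), ("conv", 3), ("conv", 3),
    ("pool", 2),
    ("conv", 1), ("conv", 3), ("conv", 3),
    ("conv", 1), ("conv", 7),
]

size = 28  # MNIST images are 28x28
sizes = [size]
for kind, k in layers:
    size = conv_out(size, k) if kind == "conv" else pool_out(size, k)
    sizes.append(size)

# The last convolution has 10 filters, so Flatten sees size * size * 10 values.
final_units = size * size * 10

print(sizes)
print(final_units)
```

If these assumptions hold, the final 7x7 convolution collapses the feature map to 1x1, so `Flatten` hands exactly 10 values to the softmax, one per digit class.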

Training log using a GPU (in Colab):

model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data = (X_test, Y_test), verbose=1)
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 19s 316us/sample - loss: 0.2102 - accuracy: 0.9328 - val_loss: 0.0627 - val_accuracy: 0.9780
Epoch 2/10
60000/60000 [==============================] - 16s 259us/sample - loss: 0.0550 - accuracy: 0.9838 - val_loss: 0.0374 - val_accuracy: 0.9879
Epoch 3/10
60000/60000 [==============================] - 16s 259us/sample - loss: 0.0390 - accuracy: 0.9878 - val_loss: 0.0339 - val_accuracy: 0.9904
Epoch 4/10
60000/60000 [==============================] - 15s 258us/sample - loss: 0.0309 - accuracy: 0.9905 - val_loss: 0.0324 - val_accuracy: 0.9899
Epoch 5/10
60000/60000 [==============================] - 15s 257us/sample - loss: 0.0261 - accuracy: 0.9916 - val_loss: 0.0314 - val_accuracy: 0.9912
Epoch 6/10
60000/60000 [==============================] - 15s 257us/sample - loss: 0.0218 - accuracy: 0.9931 - val_loss: 0.0322 - val_accuracy: 0.9908
Epoch 7/10
60000/60000 [==============================] - 15s 258us/sample - loss: 0.0183 - accuracy: 0.9942 - val_loss: 0.0325 - val_accuracy: 0.9911
Epoch 8/10
60000/60000 [==============================] - 15s 257us/sample - loss: 0.0159 - accuracy: 0.9945 - val_loss: 0.0384 - val_accuracy: 0.9884
Epoch 9/10
60000/60000 [==============================] - 15s 256us/sample - loss: 0.0149 - accuracy: 0.9953 - val_loss: 0.0509 - val_accuracy: 0.9869
Epoch 10/10
60000/60000 [==============================] - 15s 256us/sample - loss: 0.0126 - accuracy: 0.9961 - val_loss: 0.0295 - val_accuracy: 0.9923
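
The GPU log above counts samples (60000/60000), whereas the TPU call in this post passes `steps_per_epoch=468`. The arithmetic below is a small sketch of what those numbers imply for a batch size of 128. Whether the samples left over after 468 steps are dropped or carried into the next epoch depends on the input pipeline, which is not shown here, so treat `not_covered` as an upper bound rather than a statement about what actually happened.

```python
import math

train_samples = 60000
val_samples = 10000
batch_size = 128

# Batches needed to see every training sample once (the last one is partial).
full_pass_steps = math.ceil(train_samples / batch_size)
last_batch_size = train_samples - (full_pass_steps - 1) * batch_size

# Samples covered by one epoch when the step count is fixed at 468 full batches.
fixed_steps = 468
covered = fixed_steps * batch_size
not_covered = train_samples - covered

# Validation batches needed to cover the full test set.
val_steps = math.ceil(val_samples / batch_size)

print(full_pass_steps, last_batch_size)
print(covered, not_covered)
print(val_steps)
```

So a full pass takes 469 batches, and 468 steps cover 59904 of the 60000 samples. The 79 validation batches match the `79/79` lines in the TPU log.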

Training log using a TPU (in Colab):

model.fit(X_train, Y_train, batch_size=128, steps_per_epoch=468, epochs=15, validation_data = (X_test, Y_test), verbose=1)
Epoch 1/15
461/468 [============================>.] - ETA: 0s - loss: 0.2352 - accuracy: 0.9256INFO:tensorflow:Running validation at fit epoch: 0
79/79 [==============================] - 59s 752ms/step
79/79 [==============================] - 59s 752ms/step
468/468 [==============================] - 101s 215ms/step - loss: 0.2328 - accuracy: 0.9264 - val_loss: 0.0670 - val_accuracy: 0.9786
Epoch 2/15
466/468 [============================>.] - ETA: 0s - loss: 0.0573 - accuracy: 0.9827INFO:tensorflow:Running validation at fit epoch: 1
79/79 [==============================] - 62s 790ms/step
79/79 [==============================] - 62s 790ms/step
468/468 [==============================] - 103s 221ms/step - loss: 0.0572 - accuracy: 0.9827 - val_loss: 0.0462 - val_accuracy: 0.9850
Epoch 3/15
460/468 [============================>.] - ETA: 0s - loss: 0.0407 - accuracy: 0.9878INFO:tensorflow:Running validation at fit epoch: 2
79/79 [==============================] - 67s 844ms/step
79/79 [==============================] - 67s 844ms/step
468/468 [==============================] - 111s 237ms/step - loss: 0.0402 - accuracy: 0.9880 - val_loss: 0.0477 - val_accuracy: 0.9836
Epoch 4/15
466/468 [============================>.] - ETA: 0s - loss: 0.0342 - accuracy: 0.9898INFO:tensorflow:Running validation at fit epoch: 3
79/79 [==============================] - 67s 848ms/step
79/79 [==============================] - 67s 848ms/step
468/468 [==============================] - 110s 234ms/step - loss: 0.0341 - accuracy: 0.9898 - val_loss: 0.0453 - val_accuracy: 0.9860
Epoch 5/15
461/468 [============================>.] - ETA: 0s - loss: 0.0279 - accuracy: 0.9916INFO:tensorflow:Running validation at fit epoch: 4
79/79 [==============================] - 71s 895ms/step
79/79 [==============================] - 71s 895ms/step
468/468 [==============================] - 117s 250ms/step - loss: 0.0279 - accuracy: 0.9916 - val_loss: 0.0398 - val_accuracy: 0.9874
Epoch 6/15
466/468 [============================>.] - ETA: 0s - loss: 0.0225 - accuracy: 0.9930INFO:tensorflow:Running validation at fit epoch: 5
79/79 [==============================] - 79s 1s/step
79/79 [==============================] - 79s 1s/step
468/468 [==============================] - 129s 276ms/step - loss: 0.0224 - accuracy: 0.9930 - val_loss: 0.0390 - val_accuracy: 0.9877
Epoch 7/15
459/468 [============================>.] - ETA: 0s - loss: 0.0204 - accuracy: 0.9936INFO:tensorflow:Running validation at fit epoch: 6
79/79 [==============================] - 78s 984ms/step
79/79 [==============================] - 78s 984ms/step
468/468 [==============================] - 129s 277ms/step - loss: 0.0205 - accuracy: 0.9936 - val_loss: 0.0320 - val_accuracy: 0.9896
Epoch 8/15
465/468 [============================>.] - ETA: 0s - loss: 0.0165 - accuracy: 0.9945INFO:tensorflow:Running validation at fit epoch: 7
79/79 [==============================] - 81s 1s/step
79/79 [==============================] - 81s 1s/step
468/468 [==============================] - 135s 289ms/step - loss: 0.0164 - accuracy: 0.9945 - val_loss: 0.0403 - val_accuracy: 0.9889
Epoch 9/15
466/468 [============================>.] - ETA: 0s - loss: 0.0156 - accuracy: 0.9947INFO:tensorflow:Running validation at fit epoch: 8
79/79 [==============================] - 82s 1s/step
79/79 [==============================] - 82s 1s/step
468/468 [==============================] - 138s 295ms/step - loss: 0.0156 - accuracy: 0.9947 - val_loss: 0.0464 - val_accuracy: 0.9875
Epoch 10/15
460/468 [============================>.] - ETA: 0s - loss: 0.0139 - accuracy: 0.9954INFO:tensorflow:Running validation at fit epoch: 9
79/79 [==============================] - 89s 1s/step
79/79 [==============================] - 89s 1s/step
468/468 [==============================] - 147s 315ms/step - loss: 0.0137 - accuracy: 0.9954 - val_loss: 0.0431 - val_accuracy: 0.9903
Epoch 11/15
467/468 [============================>.] - ETA: 0s - loss: 0.0116 - accuracy: 0.9963INFO:tensorflow:Running validation at fit epoch: 10
79/79 [==============================] - 84s 1s/step
79/79 [==============================] - 84s 1s/step
468/468 [==============================] - 140s 298ms/step - loss: 0.0116 - accuracy: 0.9963 - val_loss: 0.0409 - val_accuracy: 0.9876
Epoch 12/15
459/468 [============================>.] - ETA: 0s - loss: 0.0116 - accuracy: 0.9963INFO:tensorflow:Running validation at fit epoch: 11
79/79 [==============================] - 85s 1s/step
79/79 [==============================] - 85s 1s/step
468/468 [==============================] - 148s 316ms/step - loss: 0.0117 - accuracy: 0.9963 - val_loss: 0.0466 - val_accuracy: 0.9870
Epoch 13/15
460/468 [============================>.] - ETA: 0s - loss: 0.0104 - accuracy: 0.9965INFO:tensorflow:Running validation at fit epoch: 12
79/79 [==============================] - 98s 1s/step
79/79 [==============================] - 98s 1s/step
468/468 [==============================] - 162s 347ms/step - loss: 0.0103 - accuracy: 0.9965 - val_loss: 0.0473 - val_accuracy: 0.9890
Epoch 14/15
461/468 [============================>.] - ETA: 0s - loss: 0.0095 - accuracy: 0.9966INFO:tensorflow:Running validation at fit epoch: 13
79/79 [==============================] - 101s 1s/step
79/79 [==============================] - 101s 1s/step
468/468 [==============================] - 168s 360ms/step - loss: 0.0094 - accuracy: 0.9967 - val_loss: 0.0421 - val_accuracy: 0.9905
Epoch 15/15
460/468 [============================>.] - ETA: 0s - loss: 0.0094 - accuracy: 0.9969INFO:tensorflow:Running validation at fit epoch: 14
79/79 [==============================] - 107s 1s/step
79/79 [==============================] - 107s 1s/step
468/468 [==============================] - 177s 379ms/step - loss: 0.0093 - accuracy: 0.9970 - val_loss: 0.0445 - val_accuracy: 0.9893
<tensorflow.python.keras.callbacks.History at 0x7f17b43c5f98>
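
To put a number on the gap, the sketch below compares the `val_accuracy` values copied by hand from the two logs above. It is only a summary of these two single runs, so it says nothing about run-to-run variance. The helper name `best_epoch` is made up for this illustration.

```python
# val_accuracy per epoch, copied from the GPU log (10 epochs)
gpu_val_acc = [0.9780, 0.9879, 0.9904, 0.9899, 0.9912,
               0.9908, 0.9911, 0.9884, 0.9869, 0.9923]

# val_accuracy per epoch, copied from the TPU log (15 epochs)
tpu_val_acc = [0.9786, 0.9850, 0.9836, 0.9860, 0.9874,
               0.9877, 0.9896, 0.9889, 0.9875, 0.9903,
               0.9876, 0.9870, 0.9890, 0.9905, 0.9893]


def best_epoch(values):
    """Return (1-based epoch, value) for the highest entry."""
    best = max(values)
    return values.index(best) + 1, best


best_gpu = best_epoch(gpu_val_acc)
best_tpu = best_epoch(tpu_val_acc)

# Express the gap as a number of images out of the 10000 test samples.
gap_images = round((best_gpu[1] - best_tpu[1]) * 10000)

print("GPU best:", best_gpu)
print("TPU best:", best_tpu)
print("Gap in test images:", gap_images)
```

On these logs the best GPU epoch is epoch 10 (0.9923) and the best TPU epoch is epoch 14 (0.9905), a difference of about 18 test images out of 10000.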

Thanks

0 answers:

No answers