I am trying to use keras.preprocessing.image.ImageDataGenerator on a TPU, but I get the error below during the first epoch. The same code works in a Jupyter notebook, but training there takes hours.
My model:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(220))
model.add(Activation('relu'))
model.add(Dropout(0.4))
model.add(Dense(120))
model.add(Activation('softmax'))
Optimizer
opt = tf.train.AdamOptimizer(learning_rate)
model.compile(
    optimizer=opt,
    loss='categorical_crossentropy',
    metrics=['acc'])
Converting the Keras model to a TPU model
try:
    device_name = os.environ['COLAB_TPU_ADDR']
    TPU_ADDRESS = 'grpc://' + device_name
    print('Found TPU at: {}'.format(TPU_ADDRESS))
except KeyError:
    print('TPU not found')
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)))
ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')  # 'binary' or 'categorical'
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
Fitting the model
model_fit = tpu_model.fit_generator(
    train_generator,
    epochs=50,
    steps_per_epoch=60,
)
I get this error:
Epoch 1/50
15/33 [============>.................] - ETA: 8s - loss: 4.7722 - acc: 0.0083INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(0,), dtype=tf.int32, name='core_id_60'), TensorSpec(shape=(0, 128, 128, 3), dtype=tf.float32, name='conv2d_3_input_20'), TensorSpec(shape=(0, 120), dtype=tf.float32, name='activation_13_target_30')]
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
   1658   try:
-> 1659     c_op = c_api.TF_FinishOperation(op_desc)
   1660   except errors.InvalidArgumentError as e:

InvalidArgumentError: slice index 0 of dimension 0 out of bounds. for 'strided_slice_19' (op: 'StridedSlice') with input shapes: [0], [1], [1], [1] and with computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
17 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
   1660   except errors.InvalidArgumentError as e:
   1661     # Convert to ValueError for backwards compatibility.
-> 1662     raise ValueError(str(e))
   1663
   1664   return c_op

ValueError: slice index 0 of dimension 0 out of bounds. for 'strided_slice_19' (op: 'StridedSlice') with input shapes: [0], [1], [1], [1] and with computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.
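
Every TensorSpec in the log has a leading dimension of 0 (for example shape=(0, 128, 128, 3)), and the log reports 8 cores. One possible reading, offered as an assumption and not a confirmed diagnosis, is that the generator yielded a batch with fewer than 8 samples, leaving nothing for each core once the batch was split. flow_from_directory yields a smaller final batch when the number of images is not a multiple of batch_size. The sketch below only does the arithmetic: the sample count of 1027, the batch size of 32, and the integer-division sharding rule are all hypothetical, and the helper names are invented for illustration.

```python
def batch_sizes(num_samples, batch_size):
    """Sizes of the batches yielded in one pass over the data,
    including a smaller final batch when there is a remainder."""
    full, remainder = divmod(num_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


def empty_shard_batches(sizes, num_cores=8):
    """Return (batch index, batch size) for batches whose per-core share is
    zero, ASSUMING each batch is split by integer division across cores."""
    return [(i, s) for i, s in enumerate(sizes) if s // num_cores == 0]


# Hypothetical numbers, not taken from the question.
num_samples = 1027
batch_size = 32

sizes = batch_sizes(num_samples, batch_size)
problem_batches = empty_shard_batches(sizes, num_cores=8)

print("batches per pass:", len(sizes))
print("batches with an empty per-core shard:", problem_batches)
```

To check against the real data, the number of images found by train_generator (its samples attribute) and the actual batch_size can be substituted for the hypothetical values. If a short final batch does show up, choosing a batch_size that is a multiple of 8 and a sample count that is a multiple of batch_size would be one way to test whether the error goes away.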