Question

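I'm not very familiar with training models on Google Colaboratory. I'm currently trying to train a Keras (tinyYolov3) model on a TPU. As far as I know, I need to use one of the available TPU devices: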

    resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_address)
    tf.contrib.distribute.initialize_tpu_system(resolver)
    strategy = tf.contrib.distribute.TPUStrategy(resolver)
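
For context, tpu_address is not defined in the snippet above. A minimal sketch of how it might be built, assuming the Colab runtime exposes the TPU endpoint through the COLAB_TPU_ADDR environment variable; the helper name colab_tpu_address and the sample address are hypothetical and only serve as illustration:

```python
import os


def colab_tpu_address(environ=None):
    """Return 'grpc://<host:port>', or None when no TPU address is set.

    `environ` defaults to os.environ; passing a dict makes the helper
    easy to try out without a TPU runtime.
    """
    environ = os.environ if environ is None else environ
    addr = environ.get('COLAB_TPU_ADDR')
    if not addr:
        # No TPU runtime attached (or the variable is empty).
        return None
    return 'grpc://' + addr


# On a Colab TPU runtime this would pick up the real endpoint.
tpu_address = colab_tpu_address()

# Made-up address, used only to show the resulting format.
example = colab_tpu_address({'COLAB_TPU_ADDR': '10.0.0.2:8470'})
```

Returning None instead of raising makes it possible to detect a runtime without a TPU before constructing the resolver.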

I then build and compile my model inside with strategy.scope():

    model = create_tiny_model(input_shape, anchors, num_classes,
                              freeze_body)

    model.load_weights(weights_path, by_name=True, skip_mismatch=True)
    model.compile(optimizer=Adam(lr=1e-3), loss={
        # use custom yolo_loss Lambda layer.
        'yolo_loss': lambda y_true, y_pred: y_pred})
    model.summary()
    print('Load weights {}.'.format(weights_path))

The problem seems to occur in the following code:

    def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
        '''data generator for fit_generator'''
        n = len(annotation_lines)
        i = 0
        while True:
            image_data = []
            box_data = []
            for b in range(batch_size):
                if i == 0:
                    np.random.shuffle(annotation_lines)
                image, box = get_random_data(annotation_lines[i], input_shape, random=True)
                image_data.append(image)
                box_data.append(box)
                i = (i + 1) % n
            image_data = np.array(image_data)
            box_data = np.array(box_data)
            y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
            yield [image_data, *y_true], np.zeros(batch_size)

    def data_generator_wrapper(annotation_lines, batch_size, input_shape, anchors, num_classes):
        n = len(annotation_lines)
        if n == 0 or batch_size <= 0:
            return None
        return data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes)
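
To make the batching behaviour of the generator easier to see in isolation, here is a stripped-down sketch of the same wrap-around pattern using only the standard library. The name cycling_batches and its shuffle/seed parameters are hypothetical; the image loading and preprocess_true_boxes steps are left out, so this only illustrates how the index cycles through the annotations, not the real data pipeline:

```python
import itertools
import random


def cycling_batches(items, batch_size, shuffle=True, seed=None):
    """Yield lists of `batch_size` items forever, wrapping around the end.

    Mirrors the index logic of data_generator: the items are reshuffled
    each time the index returns to 0.
    """
    if not items or batch_size <= 0:
        return  # nothing to yield; the generator is simply empty
    rng = random.Random(seed)
    items = list(items)  # copy so the caller's list is not shuffled in place
    n = len(items)
    i = 0
    while True:
        batch = []
        for _ in range(batch_size):
            if i == 0 and shuffle:
                rng.shuffle(items)
            batch.append(items[i])
            i = (i + 1) % n
        yield batch


# Five items with a batch size of 2: the third batch wraps around to the start.
first_three = list(itertools.islice(
    cycling_batches([0, 1, 2, 3, 4], 2, shuffle=False), 3))
```

With shuffling disabled, first_three shows that a batch can straddle the end of one pass and the start of the next, which is also what happens in data_generator when the number of annotations is not a multiple of batch_size.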

Error message:

         60 
         61 def _pop_per_thread_mode():
    ---> 62   ops.get_default_graph()._distribution_strategy_stack.pop(-1)  # pylint: disable=protected-access
         63 
         64 

    IndexError: pop from empty list

I have also tried:

    TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    tf.logging.set_verbosity(tf.logging.INFO)

    yolo_model = tf.contrib.tpu.keras_to_tpu_model(
        yolo_model,
        strategy=tf.contrib.tpu.TPUDistributionStrategy(
                    tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))

This produced the following message:

    ValueError: Layer has a variable shape in a non-batch dimension. TPU models must have constant shapes for all operations.

    You may have to specify `input_length` for RNN/TimeDistributed layers.

    Layer: <keras.engine.topology.InputLayer object at 0x7f5cbab05dd8>
    Input shape: (None, None, None, 3)
    Output shape: (None, None, None, 3)
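
As I read it, the message complains that the input layer leaves height and width unspecified (None) in addition to the batch dimension. A small sketch to make that concrete: the helper variable_non_batch_dims is hypothetical, and the 416x416 size is an assumption (a commonly used YOLO input size), not something taken from my model code:

```python
def variable_non_batch_dims(shape):
    """Return the axes, other than the batch axis 0, whose size is None."""
    return [axis for axis, dim in enumerate(shape)
            if axis > 0 and dim is None]


# Shape reported in the error: height and width are both unspecified.
reported = (None, None, None, 3)

# Assumed fixed-size alternative: only the batch dimension stays open.
fixed = (None, 416, 416, 3)

reported_axes = variable_non_batch_dims(reported)
fixed_axes = variable_non_batch_dims(fixed)
```

If this reading is right, the input layer would need a concrete height and width before the model can be converted for the TPU.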

How can I train a Keras model on a TPU in Google Colab? Do I need to change the model configuration (tinyYolo-v3)?

0 answers: