Learning rate decay in TensorFlow - piecewise_constant function error

Time: 2019-07-15 23:58:55

Tags: python python-3.x tensorflow

I am working with a Tiny YOLO v2 codebase. I run into the following error when declaring the learning rate schedule. I can see that my steps list ends up the same length as lrs, but I am not sure what a good fix is. I have also tried declaring the values explicitly (with steps one shorter than lrs), which leads to a different error.

Error:

    Traceback (most recent call last):
      File "scripts/train_tiny_yolo.py", line 335, in <module>
        lr = tf.train.piecewise_constant(global_step, steps, lrs)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/training/learning_rate_decay.py", line 147, in piecewise_constant
        name=name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/training/learning_rate_decay_v2.py", line 166, in piecewise_constant
        "The length of boundaries should be 1 less than the length of values")
    ValueError: The length of boundaries should be 1 less than the length of values

Here is the relevant part of my code:

    base_lr = params.get('learning_rate', 1e-3)
    steps = params.get('steps', [3000, 4000, 5000])

    steps_and_lrs = []
    if steps[0] > 100:
        # Warm-up
        steps_and_lrs += [
            (25, base_lr / 100),
            (50, base_lr / 10)
        ]

    steps_and_lrs += [(step, base_lr * 10**(-i)) for i, step in enumerate(steps)]
    steps, lrs = zip(*steps_and_lrs)

    # Alternative attempt to explicitly declare lr and steps values
    # steps =( 50, 20000, 30000, 40000)
    # lrs = (1e-05, 0.0001, 0.001, 0.0001, 1e-05)

    max_iter = steps[-1]
    lr = tf.train.piecewise_constant(global_step, steps, lrs)
    np.set_printoptions(precision=3, suppress=True)

    opt = tf.train.MomentumOptimizer(lr, momentum=0.9)
    grads_and_vars = opt.compute_gradients(loss)
    clip_value = params.get('clip_gradients')

    if clip_value is not None:
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]

    train_op = opt.apply_gradients(grads_and_vars,
            global_step=global_step)

    merged = tf.summary.merge_all()
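To see where the length mismatch comes from, the schedule construction can be replayed in plain Python. One possible fix (an assumption about the intended schedule, not code from the original script) is to append the rate that should apply after the last boundary, so that the values list is one longer than the boundaries list:

```python
base_lr = 1e-3
steps = [3000, 4000, 5000]

# Same construction as in the training script above.
steps_and_lrs = [(25, base_lr / 100), (50, base_lr / 10)]
steps_and_lrs += [(step, base_lr * 10 ** (-i)) for i, step in enumerate(steps)]
boundaries, rates = zip(*steps_and_lrs)

print(len(boundaries), len(rates))  # 5 5 -> tf.train.piecewise_constant rejects this

# Hypothetical fix: add the rate used after the final boundary, so that
# len(rates) == len(boundaries) + 1 as piecewise_constant requires.
rates = rates + (base_lr * 10 ** (-len(steps)),)
print(len(boundaries), len(rates))  # 5 6
```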

What I have tried:

When I give the steps and lrs values explicitly, I get the following ValueError:

    Traceback (most recent call last):
      File "scripts/train_tiny_yolo.py", line 363, in <module>
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]
      File "scripts/train_tiny_yolo.py", line 363, in <listcomp>
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
        return target(*args, **kwargs)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/ops/clip_ops.py", line 69, in clip_by_value
        t = ops.convert_to_tensor(t, name="t")
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1039, in convert_to_tensor
        return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1097, in convert_to_tensor_v2
        as_ref=False)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1175, in internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 304, in _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 245, in constant
        allow_broadcast=True)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
        allow_broadcast=allow_broadcast))
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 454, in make_tensor_proto
        raise ValueError("None values not supported.")
    ValueError: None values not supported.

I am currently using TensorFlow 1.13.1.

Any help is appreciated. Please let me know if sharing more of the codebase would give better insight.

2 Answers:

Answer 0 (score: 0):

Based on your code, steps and lrs are the same size. Please check the example provided here. According to the documentation, the number of values in steps should be one less than the number of values in lrs. Also note that there is a bug in this scheduler, which you can check here.
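The constraint exists because N boundaries split the step axis into N + 1 intervals, each of which needs its own rate. Here is a minimal pure-Python sketch of the lookup behaviour (an illustration of the idea, not TensorFlow's actual implementation):

```python
boundaries = [3000, 4000, 5000]      # 3 boundaries ...
values = [1e-3, 1e-4, 1e-5, 1e-6]    # ... require 4 rates (N + 1)

def piecewise_lookup(step, boundaries, values):
    """Return the rate for `step`: values[0] up to the first boundary,
    then values[i + 1] once boundaries[i] has been passed."""
    assert len(boundaries) == len(values) - 1
    for boundary, value in zip(boundaries, values):
        if step <= boundary:
            return value
    return values[-1]

print(piecewise_lookup(100, boundaries, values))    # 0.001
print(piecewise_lookup(3500, boundaries, values))   # 0.0001
print(piecewise_lookup(99999, boundaries, values))  # 1e-06
```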

If you are using TensorFlow 2.0, below is a working example. I have not tested it with TF 1.13.

import numpy as np
from tensorflow.python.keras.optimizer_v2 import learning_rate_schedule

n_step_epoch = 100
init_lr = 0.01
decay = 0.1

decay_type = 'multistep_15_25_100'
milestones = decay_type.split('_')
milestones.pop(0)  # drop the 'multistep' prefix
milestones = list(map(int, milestones))
boundaries = np.multiply(milestones, n_step_epoch)
values = [init_lr] + [init_lr / (decay ** -i) for i in range(1, len(milestones) + 1)]
learning_rate = learning_rate_schedule.PiecewiseConstantDecay(boundaries.tolist(), values)
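Replaying the construction above without TensorFlow (numpy swapped for a list comprehension) shows that it satisfies the length rule: three boundaries, four values.

```python
n_step_epoch = 100
init_lr = 0.01
decay = 0.1

decay_type = 'multistep_15_25_100'
milestones = [int(x) for x in decay_type.split('_')[1:]]
boundaries = [m * n_step_epoch for m in milestones]
values = [init_lr] + [init_lr / (decay ** -i) for i in range(1, len(milestones) + 1)]

print(boundaries)                    # [1500, 2500, 10000]
print(values)                        # approximately [0.01, 0.001, 0.0001, 1e-05]
print(len(boundaries), len(values))  # 3 4
```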

Hope this helps!

Answer 1 (score: 0):

I found that, even after making sure steps and lrs had different sizes, the error was caused by None values among the gradients (grads). I resolved it with a workaround that handles those None gradients.
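The workaround code appears to have been lost in formatting. A common fix for this error (a sketch of the usual approach, not necessarily this author's exact code) is to skip variables whose gradient is None before clipping; in the training script above that would be `[(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars if g is not None]`. The same logic, mimicked in plain Python with scalars:

```python
def clip(value, clip_value):
    # Scalar stand-in for tf.clip_by_value.
    return max(-clip_value, min(clip_value, value))

def clip_existing_gradients(grads_and_vars, clip_value):
    # Variables not connected to the loss get gradient None; drop them
    # instead of passing None to the clipping op.
    return [(clip(g, clip_value), v) for g, v in grads_and_vars if g is not None]

pairs = [(0.5, 'w1'), (None, 'w2'), (-3.0, 'w3')]
print(clip_existing_gradients(pairs, 1.0))  # [(0.5, 'w1'), (-1.0, 'w3')]
```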