Getting the current learning rate from tf.train.AdamOptimizer

Date: 2016-05-02 19:45:57

Tags: tensorflow

I would like to print out the learning rate at each training step of my nn.

I know Adam has an adaptive learning rate, but is there a way I can see it (for visualization in TensorBoard)?

5 Answers:

Answer 0 (score: 12)

All the optimizers have a private variable that holds the value of the learning rate.

In adagrad and gradient descent it is called self._learning_rate. In adam it is self._lr.

So you just need to print sess.run(optimizer._lr) to get this value. sess.run is needed because it is stored as a tensor.
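A minimal sketch of that, assuming TF 1.x graph mode (the toy variable and loss here are only illustrative). Note that _lr is a plain Python float when the optimizer was constructed with one, so sess.run is only needed when it is a tensor:

import tensorflow as tf

# Toy graph just so the optimizer has something to minimize.
w = tf.Variable(3.0)
loss = tf.square(w)

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
train_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)
    lr = optimizer._lr  # private attribute described above
    print(sess.run(lr) if isinstance(lr, tf.Tensor) else lr)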

Answer 1 (score: 9)

Sung Kim's suggestion worked for me; my exact steps were:

lr = 0.1
step_rate = 1000
decay = 0.95

global_step = tf.Variable(0, trainable=False)
increment_global_step = tf.assign(global_step, global_step + 1)
learning_rate = tf.train.exponential_decay(lr, global_step, step_rate, decay, staircase=True)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.01)
trainer = optimizer.minimize(loss_function)

# Some code here

print('Learning rate: %f' % (sess.run(optimizer._lr)))  # _lr lives on the optimizer, not on the minimize op
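Since the question mentions TensorBoard, a hedged sketch (not part of the original answer) for visualizing the decayed rate is to attach a scalar summary to the learning_rate tensor defined above and run it alongside the training ops:

# Assumes the graph above (learning_rate, trainer, increment_global_step, loss_function
# and whatever placeholders it needs) has already been built.
lr_summary = tf.summary.scalar('learning_rate', learning_rate)
writer = tf.summary.FileWriter('/tmp/adam_lr')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        # feed_dict omitted; supply whatever loss_function requires.
        _, _, summ = sess.run([trainer, increment_global_step, lr_summary])
        writer.add_summary(summ, step)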

Answer 2 (score: 2)

I think the easiest thing you can do is subclass the optimizer.

It has several methods, which I guess get dispatched based on the variable type. Regular dense variables seem to go through _apply_dense. This solution won't work for sparse variables or anything else.

If you look at the implementation, you can see that it stores the m and v EMAs in these "slots". So something like this seems to do it:

class MyAdam(tf.train.AdamOptimizer):
    def _apply_dense(self, grad, var):
        m = self.get_slot(var, "m")
        v = self.get_slot(var, "v")

        m_hat = m/(1-self._beta1_power)
        v_hat = v/(1-self._beta2_power)

        step = m_hat/(v_hat**0.5 + self._epsilon_t)

        # Use a histogram summary to monitor it during training.
        tf.summary.histogram("hist", step) 

        return super(MyAdam,self)._apply_dense(grad, var)

step here will be in the interval [-1, 1]; it is what the learning rate gets multiplied by to determine the actual step applied to the parameters.

There is usually no node for it in the graph, since one big training_ops.apply_adam does everything.

Here I just create a histogram summary from it, but you could instead stash it in a dictionary attached to the object and read it later, or do whatever you like with it.
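A hedged sketch of that dictionary variant (the class name and the step_tensors attribute are hypothetical, not part of the original answer):

class MyAdamWithSteps(tf.train.AdamOptimizer):
    """Same idea as MyAdam above, but keeps the per-variable step tensors on the object."""

    def _apply_dense(self, grad, var):
        m = self.get_slot(var, "m")
        v = self.get_slot(var, "v")

        m_hat = m / (1 - self._beta1_power)
        v_hat = v / (1 - self._beta2_power)
        step = m_hat / (v_hat ** 0.5 + self._epsilon_t)

        # Stash the tensor so it can be fetched later, e.g. sess.run(opt.step_tensors[var.name]).
        if not hasattr(self, "step_tensors"):
            self.step_tensors = {}
        self.step_tensors[var.name] = step

        return super(MyAdamWithSteps, self)._apply_dense(grad, var)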

Dropping the MyAdam class above into mnist_deep.py and adding some summaries to the training loop:

all_summaries = tf.summary.merge_all()  
file_writer = tf.summary.FileWriter("/tmp/Adam")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy,summaries = sess.run(
                [accuracy,all_summaries],
                feed_dict={x: batch[0], y_: batch[1], 
                           keep_prob: 1.0})
            file_writer.add_summary(summaries, i)
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

That produces the following figure in TensorBoard:

[TensorBoard Histogram tab, showing 8 different histogram plots]

Answer 3 (score: 2)

In TensorFlow 2:

optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.1)  # or any other optimizer
print(optimizer.learning_rate.numpy())  # or print(optimizer.lr.numpy())

Note: this gives you the base learning rate. See this answer for more detail on the adaptive learning rate.
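If the goal is still visualization in TensorBoard, one option is a small Keras callback that writes the optimizer's base learning rate each epoch. This is a sketch under the assumption that you are training a tf.keras model; the names here are illustrative:

import tensorflow as tf

writer = tf.summary.create_file_writer("/tmp/lr_logs")

class LRLogger(tf.keras.callbacks.Callback):
    """Logs the optimizer's (base) learning rate to TensorBoard once per epoch."""

    def on_epoch_end(self, epoch, logs=None):
        lr = tf.keras.backend.get_value(self.model.optimizer.learning_rate)
        with writer.as_default():
            tf.summary.scalar("learning_rate", lr, step=epoch)

# Usage: model.fit(x, y, epochs=10, callbacks=[LRLogger()])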

Answer 4 (score: 1)

In the TensorFlow source, the current lr of the Adam optimizer is computed as follows:

    lr = (lr_t * math_ops.sqrt(1 - beta2_power) / (1 - beta1_power))

So, try this:

    current_lr = (optimizer._lr_t * tf.sqrt(1 - optimizer._beta2_power) / (1 - optimizer._beta1_power))

    eval_current_lr = sess.run(current_lr)
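The same bias-corrected rate can be checked by hand; a minimal sketch assuming Adam's default hyperparameters, where t is the step count:

import numpy as np

def adam_effective_lr(lr=0.001, beta1=0.9, beta2=0.999, t=1):
    # Mirrors lr_t * sqrt(1 - beta2_power) / (1 - beta1_power) from the snippet above,
    # with beta_power = beta ** t at step t.
    return lr * np.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)

print(adam_effective_lr(t=1))      # ~0.000316 on the first step
print(adam_effective_lr(t=10000))  # approaches the base rate of 0.001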