Where to place tf.global_variables_initializer()

Date: 2018-01-29 15:21:34

Tags: python tensorflow deep-learning

I am a beginner in deep learning and I am stuck on this problem.

import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.utils import  shuffle
from sklearn.model_selection import train_test_split

#define the one hot encode function
def one_hot_encode(labels):
  n_labels = len(labels)
  n_unique_labels = len(np.unique(labels))
  one_hot_encode = np.zeros((n_labels,n_unique_labels))
  one_hot_encode[np.arange(n_labels), labels] = 1
  return one_hot_encode

#Read the sonar dataset
df = pd.read_csv('sonar.csv')
print(len(df.columns))
X = df[df.columns[0:60]].values
y=df[df.columns[60]]
#encode the dependent variable containing categorical values
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)
Y = one_hot_encode(y)

#Transform the data in training and testing
X,Y = shuffle(X,Y,random_state=1)
train_x,test_x,train_y,test_y = train_test_split(X,Y,test_size=0.20,random_state=42)


#define and initialize the variables to work with the tensors
learning_rate = 0.1
training_epochs = 1000

#Array to store cost obtained in each epoch
cost_history = np.empty(shape=[1],dtype=float)

n_dim = X.shape[1]
n_class = 2

x = tf.placeholder(tf.float32,[None,n_dim])
W = tf.Variable(tf.zeros([n_dim,n_class]))
b = tf.Variable(tf.zeros([n_class]))

#initialize all variables.


#define the cost function
y_ = tf.placeholder(tf.float32,[None,n_class])
y = tf.matmul(x, W)+ b
init = tf.global_variables_initializer()  # wrong position
cost_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y,labels=y_))

training_step = tf.train.AdamOptimizer(learning_rate).minimize(cost_function)
init = tf.global_variables_initializer()  # correct position

#initialize the session
sess = tf.Session()
sess.run(init)
mse_history = []

#calculate the cost for each epoch
for epoch in range(training_epochs):
    sess.run(training_step,feed_dict={x:train_x,y_:train_y})
    cost = sess.run(cost_function,feed_dict={x: train_x,y_: train_y})
    cost_history = np.append(cost_history,cost)
    print('epoch : ', epoch,  ' - ', 'cost: ', cost)

pred_y = sess.run(y, feed_dict={x: test_x})
print(pred_y)
#Calculate Accuracy
correct_prediction = tf.equal(tf.argmax(pred_y,1), tf.argmax(test_y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy))
sess.close()

In the code above, if I put init = tf.global_variables_initializer() above AdamOptimizer, it raises an error, but if I put it after AdamOptimizer, it runs fine. What is the reason? With GradientDescentOptimizer it works fine in both positions.

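2 Answers:

Answer 0 (score: 2)

According to the documentation, init = tf.global_variables_initializer() is the same as init = tf.variables_initializer(tf.global_variables()).

tf.train.AdamOptimizer needs some internal variables (statistics about the mean, etc.) that have to be initialized: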

<tf.Variable 'beta1_power:0' shape=() dtype=float32_ref>
<tf.Variable 'beta2_power:0' shape=() dtype=float32_ref>
<tf.Variable 'x/Adam:0' shape=(2, 1) dtype=float32_ref>    # 1st moment vector
<tf.Variable 'x/Adam_1:0' shape=(2, 1) dtype=float32_ref>  # 2nd moment vector

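The documentation tells you how the updates are applied.

In contrast, the vanilla gradient descent optimizer tf.train.GradientDescentOptimizer does not rely on any variables. That is the difference. Now, before tf.train.AdamOptimizer can use its variables, these variables need to be initialized at some point.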
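To make the difference concrete, here is a minimal pure-Python sketch of the two update rules for a single scalar parameter. It is not TensorFlow's implementation: the names sgd_step, AdamState, and adam_step are made up for illustration, and the hyperparameter defaults are assumptions. It only shows that gradient descent needs no stored state, while Adam reads and writes four values that must exist before the first step.

```python
import math

def sgd_step(x, grad, lr):
    """Plain gradient descent: stateless, so there is nothing to initialize."""
    return x - lr * grad

class AdamState:
    """Hypothetical container for the state Adam keeps for one scalar parameter."""
    def __init__(self):
        self.m = 0.0            # 1st moment estimate (compare 'x/Adam')
        self.v = 0.0            # 2nd moment estimate (compare 'x/Adam_1')
        self.beta1_power = 1.0  # running product beta1**t (compare 'beta1_power')
        self.beta2_power = 1.0  # running product beta2**t (compare 'beta2_power')

def adam_step(x, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; it reads and writes state that was created beforehand."""
    state.beta1_power *= beta1
    state.beta2_power *= beta2
    state.m = beta1 * state.m + (1.0 - beta1) * grad
    state.v = beta2 * state.v + (1.0 - beta2) * grad * grad
    # Bias-corrected step size, then the parameter update.
    lr_t = lr * math.sqrt(1.0 - state.beta2_power) / (1.0 - state.beta1_power)
    return x - lr_t * state.m / (math.sqrt(state.v) + eps)

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3), evaluated at x = 0.
grad_at_zero = 2.0 * (0.0 - 3.0)

x_sgd = sgd_step(0.0, grad_at_zero, lr=0.1)        # no state needed
adam_state = AdamState()                           # must exist before the first step
x_adam = adam_step(0.0, grad_at_zero, adam_state)
print(x_sgd, x_adam)
```

In TensorFlow the counterpart of AdamState is the set of extra tf.Variable objects listed above, which is why they have to be covered by the initializer.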

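To create the operation init that initializes all required variables, this operation needs to know which variables the program requires. Therefore, it has to be placed after tf.train.AdamOptimizer.

If you were to place init = tf.global_variables_initializer() before tf.train.AdamOptimizer, as in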

init_op = tf.variables_initializer(tf.global_variables())
optimize_op = tf.train.AdamOptimizer(0.1).minimize(cost_function)

you would get

Attempting to use uninitialized value beta1_power

which tells you that tf.train.AdamOptimizer tries to access <tf.Variable 'beta1_power:0' shape=() dtype=float32_ref>, which has not been initialized yet.

So

# ...
... = tf.train.AdamOptimizer(0.1).minimize(cost_function)
# ...
init = tf.global_variables_initializer()

is the only correct way. You can check which variables would be initialized by placing

for variable in tf.global_variables():
    print(variable)

in your source code.

Consider the example of minimizing the quadratic form 0.5x'Ax + bx + c. In TensorFlow, this would be

import tensorflow as tf
import numpy as np

x = tf.Variable(np.random.rand(2, 1), dtype=tf.float32, name="x")
# we already make clear, that we are not going to optimize these variables
b = tf.constant([[5], [6]], dtype=tf.float32, name="b")
A = tf.constant([[9, 2], [2, 10]], dtype=tf.float32, name="A")

cost_function = 0.5 * tf.matmul(tf.matmul(tf.transpose(x), A), x) - tf.matmul(tf.transpose(b), x) + 42

for variable in tf.global_variables():
    print('before ADAM: global_variables_initializer would init {}'.format(variable))

optimize_op = tf.train.AdamOptimizer(0.1).minimize(cost_function)

for variable in tf.global_variables():
    print('after ADAM: global_variables_initializer would init {}'.format(variable))

init_op = tf.variables_initializer(tf.global_variables())
with tf.Session() as sess:
    sess.run(init_op)

    for i in range(5):
        loss, _ = sess.run([cost_function, optimize_op])
        print(loss)

Output

before ADAM global_variables_initializer would init <tf.Variable 'x:0' shape=(2, 1) dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'x:0' shape=(2, 1) dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'beta1_power:0' shape=() dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'beta2_power:0' shape=() dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'x/Adam:0' shape=(2, 1) dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'x/Adam_1:0' shape=(2, 1) dtype=float32_ref>

Therefore, when tf.global_variables_initializer() is placed before the ADAM definition tf.train.AdamOptimizer, init = tf.global_variables_initializer() does not see the variables required by ADAM. When using GradientDescentOptimizer, the output is

before ADAM global_variables_initializer would init <tf.Variable 'x:0' shape=(2, 1) dtype=float32_ref>
after ADAM global_variables_initializer would init <tf.Variable 'x:0' shape=(2, 1) dtype=float32_ref>

So nothing changes before and after the optimizer.
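As a sanity check for the loss values printed by the example, the minimum of its cost can be worked out by hand. Note that the code uses 0.5x'Ax - b'x + 42, with a minus sign in front of the b term, so setting the gradient Ax - b to zero gives Ax = b. The sketch below solves that 2x2 system exactly with Cramer's rule; the helper name cost is made up for illustration, and only the constants A, b, and 42 come from the example.

```python
from fractions import Fraction

# Constants copied from the TensorFlow example above.
A = [[Fraction(9), Fraction(2)], [Fraction(2), Fraction(10)]]
b = [Fraction(5), Fraction(6)]
c = Fraction(42)

def cost(x):
    """Evaluate 0.5 * x'Ax - b'x + c for a 2-vector x."""
    Ax = [A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]]
    quadratic = Fraction(1, 2) * (x[0] * Ax[0] + x[1] * Ax[1])
    linear = b[0] * x[0] + b[1] * x[1]
    return quadratic - linear + c

# Gradient Ax - b = 0 means Ax = b; solve it with Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x_star = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
          (A[0][0] * b[1] - A[1][0] * b[0]) / det]

print(x_star, float(cost(x_star)))
```

Since A is positive definite (9 > 0 and the determinant is 86 > 0), this stationary point is the minimum, so a longer optimization run should approach a loss of roughly 39.36 from above.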

Answer 1 (score: 0)

In my experience, init = tf.global_variables_initializer() only initializes the variables that were declared before it.

For example, consider the following code:

variable_1 = tf.get_variable("v_1",[5,5],tf.float32,initializer=tf.zeros_initializer)
init = tf.global_variables_initializer()
variable_2 = tf.get_variable("v_2",[5,5],tf.float32,initializer=tf.zeros_initializer)

The following code prints the values in variable_1 (5x5, all zeros):

with tf.Session() as sess:
    sess.run(init) 
    print(sess.run(variable_1))

However, the following code produces an "Attempting to use uninitialized value" error:

with tf.Session() as sess:
    sess.run(init) 
    print(sess.run(variable_2))

To sum up, in most cases, simply place init = tf.global_variables_initializer() after all other variables.
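The behaviour described in both answers can be mimicked with a small pure-Python toy model. This is only an analogy under stated assumptions, not how TensorFlow is implemented: ToyVariable, make_initializer, and the global_variables list are hypothetical names. The point it illustrates is that an initializer built from the collection covers only the variables registered up to that moment.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable that registers itself on creation."""
    def __init__(self, name, initial_value, collection):
        self.name = name
        self.initial_value = initial_value
        self.value = None            # None stands for "uninitialized"
        collection.append(self)

    def read(self):
        if self.value is None:
            raise RuntimeError("Attempting to use uninitialized value " + self.name)
        return self.value

def make_initializer(collection):
    """Return an init function covering only the variables registered so far."""
    snapshot = list(collection)      # copy: later additions are not seen
    def init():
        for var in snapshot:
            var.value = var.initial_value
    return init

global_variables = []
v_1 = ToyVariable("v_1", 0.0, global_variables)
init = make_initializer(global_variables)        # built too early
v_2 = ToyVariable("v_2", 0.0, global_variables)

init()
print(v_1.read())                                # v_1 was registered before init was built
try:
    v_2.read()
    v_2_error = None
except RuntimeError as err:
    v_2_error = str(err)
print(v_2_error)

late_init = make_initializer(global_variables)   # built after all variables exist
late_init()
print(v_2.read())
```

An optimizer that creates its own variables, such as Adam, plays the role of v_2 here: its variables join the collection only when the optimizer is constructed.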