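I am new to TensorFlow. The following code runs successfully without any errors. For the first 10 or so lines of output, computation is fast and the output (defined in the last line) flies by line by line. However, as the iterations go on, the computation gets slower and slower and eventually becomes unbearable. So I would like to know whether any modification could speed this up.
Here is a brief description of the code: it applies a neural network with a single hidden layer to a dataset. It aims to find the best values for rate[0] and rate[1], which affect the loss function. At each training step, one tuple is fed to the model and the accuracy on that tuple is evaluated immediately (in the real world this data arrives as a stream).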
import tensorflow as tf
import numpy as np

n_hidden=50
n_input=37
n_output=2
data_raw=np.genfromtxt(r'data.csv',delimiter=",",dtype=None)
data_info=np.genfromtxt(r'data2.csv',delimiter=",",dtype=None)

def pre_process( tuple):
    ans = []
    temp = [0 for i in range(24)]
    temp[int(tuple[0])] = 1
    # np.append(ans,np.array(temp))
    ans.extend(temp)
    temp = [0 for i in range(7)]
    temp[int(tuple[1]) - 1] = 1
    ans.extend(temp)
    # np.append(ans,np.array(temp))
    temp = [0 for i in range(3)]
    temp[int(tuple[3])] = 1
    ans.extend(temp)
    temp = [0 for i in range(2)]
    temp[int(tuple[4])] = 1
    ans.extend(temp)
    ans.extend([int(tuple[5])])
    return np.array(ans)

x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])
W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
b1=tf.Variable(tf.zeros([n_hidden]))
W2=tf.Variable(tf.zeros([n_hidden,n_output]))
b2=tf.Variable(tf.zeros([n_output]))
logits_1 = tf.matmul(x, W1) + b1
relu_layer= tf.nn.relu(logits_1)
logits_2 = tf.matmul(relu_layer, W2) + b2
correct_prediction = tf.equal(tf.argmax(logits_2,1), tf.argmax(y_,0))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

rate=[0,0]
for i in range(-100,200,10):
    rate[0]=i;
    for j in range(-100,i,10):
        rate[1]=j
        loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits_2)*[rate[0],rate[1]])
        # loss2=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_r, logits=logits_2)*[rate[2],rate[3]])
        # loss=loss1+loss2
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
        data_line=1
        accur=0
        local_local=0
        remote_remote=0
        local_remote=0
        remote_local=0
        total=0
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for i in range(200):
                # print(int(data_raw[data_line][0]),data_info[i][0])
                if i>100:
                    total+=1
                if int(data_raw[data_line][0])==data_info[i][0]:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[1,0],y_r:[0,1]})
                    # print(sess.run(logits_2,{x:pre_process(data_info[i]).reshape(1,-1), y_: #[1,0]}))
                    data_line+=1;
                    if data_line==len(data_raw):
                        break
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [1,0], y_r:[0,1]})
                        local_local+=acc
                        local_remote+=1-acc
                        accur+=acc
                else:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[0,1], y_r:[1,0]})
                    # print(sess.run(logits_2,{x: pre_process(data_info[i]).reshape(1,-1), y_: #[0,1]}))
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [0,1], y_r:[1,0]})
                        remote_remote+=acc
                        remote_local+=1-acc
                        accur+=acc
        print("correctness: (%.3d,%.3d): \t%.2f %.2f %.2f %.2f %.2f" % (rate[0],rate[1],accur/total,local_local/total,local_remote/total,remote_local/total,remote_remote/total))
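Answer 0 (score: 4)
While GPhilo's answer addresses why the code runs slower and slower, his solution ends up creating the computation graph again and again, which is not good.
The following two lines of code (which GPhilo also mentioned) keep adding operations to the graph on every iteration.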
loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits( \
labels=y_, logits=logits_2)*[rate[0],rate[1]])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
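To make the growth visible, here is a minimal sketch that counts the operations in the default graph after each pass through such a loop. It is not the asker's network: the tiny model, its sizes, and the variable names are stand-ins chosen for illustration, and it assumes a TensorFlow install that ships the `tensorflow.compat.v1` module (on plain TF 1.x, `import tensorflow as tf` and dropping the `disable_eager_execution()` call should behave the same way).

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x with the v1 API

tf.disable_eager_execution()  # graph mode, as in the question's TF 1.x code
tf.reset_default_graph()

# Tiny stand-in model (hypothetical sizes, not the asker's network).
x = tf.placeholder(tf.float32, shape=[1, 3])
y_ = tf.placeholder(tf.float32, shape=[1, 2])
W = tf.Variable(tf.zeros([3, 2]))
logits = tf.matmul(x, W)

graph = tf.get_default_graph()
op_counts = []
for rate in range(5):
    # Same pattern as in the question: a new loss and a new optimizer
    # are built on every pass, so new ops are appended to the same graph.
    loss = tf.reduce_sum(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits)
        * float(rate))
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
    op_counts.append(len(graph.get_operations()))

print(op_counts)  # each entry is larger than the one before it
```

Nothing is ever removed from the graph here, so the count only goes up; that is the growth described above.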
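As far as I can see, you have two values, rate[0] and rate[1], that need to be fed to your graph. Why not supply these two values through placeholders and define the graph only once? Once you start running the Session, you should not add any more operations to the graph. Also, you should not create the Session inside the iteration loop.
Check this modified code (important parts only):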
# To clear previously created graph (if any) present in memory.
tf.reset_default_graph()
x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])
# Add these two placeholders (Assuming they are single float value)
rate0 = tf.placeholder(tf.float32, shape = [])
rate1 = tf.placeholder(tf.float32, shape = [])
W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
....
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Bring this code outside from loop (Note replacement of rate[0] with placeholder)
loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, \
logits=logits_2) * [rate0, rate1])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
# Instantiate session only once.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Move the subsequent looping code inside.
    rate=[0,0]
    for i in range(-100,200,10):
        rate[0]=i;
After this modification, whenever your Session runs train_step, you also need to supply these two extra placeholders in feed_dict.
For example:
sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),
y_:[1,0],y_r:[0,1], rate0: rate[0], rate1: rate[1]})
This way you do not create the graph on every iteration, and in fact this code will be faster than GPhilo's solution.
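Putting the pieces together, a self-contained sketch of the build-once approach might look like the following. Several things here are assumptions rather than part of the answer above: the two rates are fed through a single placeholder named `rates`, random vectors stand in for `pre_process(data_info[i])`, the rate grid is coarsened to keep the run short, and `graph.finalize()` is added so that any accidental attempt to add an op later raises an error. It also re-runs the initializer for each rate pair, on the assumption that every pair should start from fresh weights as in the question's code.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x with the v1 API

tf.disable_eager_execution()
tf.reset_default_graph()

n_input, n_hidden, n_output = 37, 50, 2

# Build the graph exactly once, before any Session is opened.
x = tf.placeholder(tf.float32, shape=[1, n_input])
y_ = tf.placeholder(tf.float32, shape=[1, n_output])
rates = tf.placeholder(tf.float32, shape=[2])  # holds [rate0, rate1]

W1 = tf.Variable(tf.random_uniform([n_input, n_hidden]))
b1 = tf.Variable(tf.zeros([n_hidden]))
W2 = tf.Variable(tf.zeros([n_hidden, n_output]))
b2 = tf.Variable(tf.zeros([n_output]))
logits = tf.matmul(tf.nn.relu(tf.matmul(x, W1) + b1), W2) + b2

loss = tf.reduce_sum(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits) * rates)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
init = tf.global_variables_initializer()

graph = tf.get_default_graph()
graph.finalize()  # from here on, adding an op to this graph raises an error
ops_before = len(graph.get_operations())

rng = np.random.RandomState(0)
with tf.Session() as sess:
    for r0 in range(-100, 200, 100):      # coarser grid than the question's
        for r1 in range(-100, r0, 100):
            sess.run(init)                # fresh weights for this rate pair
            for _ in range(5):
                sample = rng.rand(1, n_input)  # stand-in for pre_process(...)
                sess.run(train_step,
                         feed_dict={x: sample, y_: [[1, 0]], rates: [r0, r1]})

ops_after = len(graph.get_operations())
print(ops_before, ops_after)  # the two counts are equal
```

Because the rates arrive through `feed_dict`, the loop only executes existing ops, and the op count stays the same no matter how many rate pairs are tried.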
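Answer 1 (score: 2)
Every time you run train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss), you add (quite a few) operations to the graph, so the graph keeps getting bigger as your program loops. The bigger the graph, the slower the execution.
Put your model definition inside the loop body and call tf.reset_default_graph() every time you start a new iteration: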
rate=[0,0]
for i in range(-100,200,10):
    rate[0]=i;
    for j in range(-100,i,10):
        tf.reset_default_graph()
        x=tf.placeholder(tf.float32, shape=[1,n_input])
        y_=tf.placeholder(tf.float32,shape=[n_output])
        y_r=tf.placeholder(tf.float32,shape=[n_output])
        W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
        b1=tf.Variable(tf.zeros([n_hidden]))
        W2=tf.Variable(tf.zeros([n_hidden,n_output]))
        b2=tf.Variable(tf.zeros([n_output]))
        logits_1 = tf.matmul(x, W1) + b1
        relu_layer= tf.nn.relu(logits_1)
        logits_2 = tf.matmul(relu_layer, W2) + b2
        correct_prediction = tf.equal(tf.argmax(logits_2,1), tf.argmax(y_,0))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        rate[1]=j
        #...
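One way to keep this reset-per-iteration pattern tidy is to wrap the model definition in a function and call it right after the reset. The sketch below is only an illustration under assumptions: `build_model` is a hypothetical helper, the rate grid is coarsened, a zero vector stands in for a real input row, and the `tensorflow.compat.v1` module is assumed to be available.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x with the v1 API

tf.disable_eager_execution()

def build_model(rate0, rate1, n_input=37, n_hidden=50, n_output=2):
    """Hypothetical helper: build the whole model in the current default graph."""
    x = tf.placeholder(tf.float32, shape=[1, n_input])
    y_ = tf.placeholder(tf.float32, shape=[1, n_output])
    W1 = tf.Variable(tf.random_uniform([n_input, n_hidden]))
    b1 = tf.Variable(tf.zeros([n_hidden]))
    W2 = tf.Variable(tf.zeros([n_hidden, n_output]))
    b2 = tf.Variable(tf.zeros([n_output]))
    logits = tf.matmul(tf.nn.relu(tf.matmul(x, W1) + b1), W2) + b2
    ce = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits)
    loss = tf.reduce_sum(ce * [float(rate0), float(rate1)])
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
    return x, y_, train_step

op_counts = []
for i in range(-100, 200, 100):          # coarser grid than the question's
    for j in range(-100, i, 100):
        tf.reset_default_graph()         # discard the previous pair's graph
        x, y_, train_step = build_model(i, j)
        op_counts.append(len(tf.get_default_graph().get_operations()))
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            # A zero vector stands in for one pre-processed input row.
            sess.run(train_step, feed_dict={x: [[0.0] * 37], y_: [[1, 0]]})

print(op_counts)  # the same count for every rate pair
```

Each rate pair gets a graph of the same size, so the slowdown goes away, at the cost of rebuilding the model every time.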