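Model inference run time increases with repeated inference

Asked: 2018-07-08 01:17:19

Tags: python performance tensorflow

I'm coding a TensorFlow project in which I edit every weight and bias manually, so I set up the weights and biases as dictionaries, the way it was done in older TensorFlow code, instead of using tf.layers.dense and letting TensorFlow take care of updating the weights. (This is the cleanest approach I came up with, although it is probably not ideal.)

I feed a fixed model the same data in every iteration, but the run time increases over the course of program execution.

I stripped almost everything out of my code so I could see where the problem lies, but I can't understand what is causing the run time to increase.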

---Games took   2.6591222286224365 seconds ---
---Games took   3.290001153945923 seconds ---
---Games took   4.250034332275391 seconds ---
---Games took   5.190149307250977 seconds ---

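Edit: I've managed to reduce the run time by using a placeholder, which doesn't add extra nodes to the graph, but the run time still increases, just at a slower rate. I'd like to eliminate this growth entirely. (After a while it goes from 0.1 seconds to over 1 second.)

Here is my full code: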

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases


weights, biases = generate_initial_population(population_size)
data = tf.placeholder(dtype=tf.float32) #will add shape later

def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game():


     model_input = [0] * 9
     model_out = model(data)

     for game_step in range(game_steps):

        move = sess.run(model_out, feed_dict={data: model_input})[0]


sess = tf.Session()
sess.run(tf.global_variables_initializer())
while True:
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

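2 Answers:

Answer 0 (score: 1)

There are some strange things going on in this code, so it will be tricky to give you an answer that truly solves the underlying problem. I can, however, address the growth in run time that you are observing. Below, I've modified your code to move the input pattern generation and the call to model out of the game loop: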

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases


weights, biases = generate_initial_population(population_size)


def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game():

    # Extract input pattern generation.
    model_input = np.float32([[0]*9])
    model_out = model(model_input)

    for game_step in range(game_steps):

        start_time = time.time()
        move = sess.run(model_out)[0]

        # print("---Step took   %s seconds ---" % (time.time() - start_time))


sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(5):
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

If you run it, this code should give you output along these lines:

---Games took   0.42223644256591797 seconds ---
---Games took   0.13168787956237793 seconds ---
---Games took   0.2452383041381836 seconds ---
---Games took   0.20023465156555176 seconds ---
---Games took   0.19905781745910645 seconds ---

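Clearly, this resolves the run time growth you were observing. It also reduces the maximum observed run time by an order of magnitude! The reason is that every time you call model, you are actually creating a new set of tf.Tensor objects (and the ops that produce them) and adding them to the graph, so the graph keeps growing. This misunderstanding is common, and it comes from trying to use Tensors in imperative Python code as if they were ordinary Python variables. I recommend reading through the graphs guide before going any further.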
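To make the graph growth described above visible, here is a minimal sketch that counts the ops in the default graph after each call to a model-building function. It assumes a TensorFlow install that provides tensorflow.compat.v1 (1.14+ or 2.x); on an older 1.x release, `import tensorflow as tf` should work instead. The names `hidden`, `w`, `b` and `op_count` are made up for illustration and are not from the question.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()
tf.reset_default_graph()

# Illustrative variables, loosely mirroring the shapes in the question.
hidden = tf.Variable(tf.truncated_normal([1, 20], seed=1))
w = tf.Variable(tf.truncated_normal([20, 9], seed=1))
b = tf.Variable(tf.truncated_normal([9], seed=1))


def model():
    # Every call creates fresh MatMul and Add ops in the default graph.
    return tf.add(tf.matmul(hidden, w), b)


def op_count():
    # Number of ops currently registered in the default graph.
    return len(tf.get_default_graph().get_operations())


counts = []
for _ in range(3):
    model()
    counts.append(op_count())

# Difference in op count between consecutive calls.
growth = [later - earlier for earlier, later in zip(counts, counts[1:])]
print(counts, growth)
```

If the count keeps rising while the program runs, some code path is still building graph nodes inside a loop.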

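It is also important to note that this is not the correct way to pass values into a graph in TensorFlow. I can see that you want to pass a different value to the model on each iteration of the game, but you cannot do that by passing the value to a Python function. You have to create a tf.placeholder in the model graph and feed the value you want the model to process into that placeholder. There are many ways to do this, but you can find one example here. Hope this helps!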
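As a rough sketch of the placeholder approach, the graph below is built once, and only the fed value changes between steps. This is not the asker's actual model: the two-layer structure, the ReLU activation and the one-hot `board` input are assumptions made for illustration, and the import again assumes tensorflow.compat.v1 is available. Calling `finalize()` on the graph is optional, but it turns any accidental node creation into an error.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()
tf.reset_default_graph()

n_inputs, n_hidden_1, n_class = 9, 20, 9

# Build the graph exactly once, before any game loop runs.
data = tf.placeholder(tf.float32, shape=[None, n_inputs])
w_h1 = tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=1))
b_h1 = tf.Variable(tf.truncated_normal([n_hidden_1], seed=1))
w_out = tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=1))
b_out = tf.Variable(tf.truncated_normal([n_class], seed=1))
hidden = tf.nn.relu(tf.add(tf.matmul(data, w_h1), b_h1))  # activation is an assumption
model_out = tf.add(tf.matmul(hidden, w_out), b_out)
init = tf.global_variables_initializer()

moves = []
with tf.Session() as sess:
    sess.run(init)
    sess.graph.finalize()  # the graph is now read-only

    for step in range(3):
        # Hypothetical input: a different one-hot board on each step.
        board = np.zeros((1, n_inputs), dtype=np.float32)
        board[0, step] = 1.0
        moves.append(sess.run(model_out, feed_dict={data: board})[0])

# Any later attempt to add a node is rejected by the finalized graph.
try:
    tf.add(model_out, 1.0)
    blocked = False
except RuntimeError:
    blocked = True
print(len(moves), blocked)
```

Inside the loop only sess.run is called, so no new nodes are created there.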

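Answer 1 (score: 1)

I'm adding another answer because the latest edit to the question changed it substantially. You are still seeing growth in run time because you are still calling model multiple times within sess. You have only reduced how often nodes are added to the graph. What you need to do is create a new session for each model you want to build, and close each session when you are done with it. I've modified your code to do this: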

import numpy as np
import tensorflow as tf
import time


n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases



def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game(sess):

    model_input = [0] * 9

    model_out = model(data)

    for game_step in range(game_steps):

        move = sess.run(model_out, feed_dict={data: model_input})[0]

while True:

    for _ in range(int(games_in_generation)):

        # Reset the graph.
        tf.reset_default_graph()

        weights, biases = generate_initial_population(population_size)
        data = tf.placeholder(dtype=tf.float32) #will add shape later

        # Create session.
        with tf.Session() as sess:

            sess.run(tf.global_variables_initializer())

            start_time = time.time()

            play_game(sess)

            print("---Games took   %s seconds ---" % (time.time() - start_time))

            sess.close()

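What I've done here is wrap the call to play_game in a session defined in a with scope, and exit that session with a call to sess.close after the call to play_game. I also reset the default graph. I've run this for several hundred iterations and have seen no increase in run time.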
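One way to check that the reset-per-game approach keeps the graph from growing is to record the op count after every game. The sketch below is a simplified stand-in for the code above rather than a copy of it: `build_model` and its single-layer shapes are hypothetical, and the import assumes tensorflow.compat.v1 is available (TF 1.14+ or 2.x).

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()


def build_model():
    # Hypothetical single-layer model, kept small for illustration.
    data = tf.placeholder(tf.float32, shape=[None, 9])
    w = tf.Variable(tf.truncated_normal([9, 9], seed=1))
    b = tf.Variable(tf.truncated_normal([9], seed=1))
    return data, tf.add(tf.matmul(data, w), b)


op_counts = []
for game in range(3):
    # Start every game from an empty default graph.
    tf.reset_default_graph()
    data, model_out = build_model()

    # The with block closes the session on exit.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(model_out, feed_dict={data: [[0.0] * 9]})

    op_counts.append(len(tf.get_default_graph().get_operations()))

print(op_counts)
```

The op count should be the same after every game, since each one rebuilds the same small graph from scratch. Because the with block already closes the session, a separate sess.close() call is not strictly required.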