TensorFlow run time is stable on CPU but keeps increasing on GPU

Date: 2019-07-23 18:58:08

Tags: python performance tensorflow gpu slowdown

I am timing how long each batch takes to run. On the CPU, an iteration initially takes about 10x as long as on the GPU, but the time stays consistent; on the GPU, however, the time per iteration keeps increasing. Here is the sample code:

import numpy as np
import tensorflow as tf
from functools import wraps
from time import time

def feedforward_model(graph, input_x_placeholder):
    with graph.as_default():
        # nn_layer (defined elsewhere) maps the input placeholder to the output logits y
        y = nn_layer(input_x_placeholder)
        return y

def train_model(graph, y_true_placeholder, predicted_y):
    with graph.as_default():
        cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_true_placeholder, logits=predicted_y)
        # learning_rate is defined elsewhere
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
        return train_step, cross_entropy

def timeit(f):
    @wraps(f)
    def wrap(*args, **kw):
        ts = time()
        result = f(*args, **kw)
        te = time()
        delta = te-ts
        print('func:%r took: %2.4f sec' %(f.__name__, delta))
        return result
    return wrap

g = tf.Graph()

with g.as_default():
    # placeholder block
    train_x = tf.placeholder(tf.float32)
    train_y = tf.placeholder(tf.int32)

def feed_dict():
    # data batching block: x and y are generated in batches (details elided)
    x = ...  # batch of input features
    y = ...  # batch of integer labels
    return {train_x: x, train_y: y}

with tf.Session(graph=g, config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
    run_metadata = tf.RunMetadata()

    with g.as_default():
        pred_y = feedforward_model(g, train_x)
        train_step, cross_entropy = train_model(g, train_y, pred_y)

    @timeit
    def train_run():
        _, loss = sess.run([train_step, cross_entropy], feed_dict=feed_dict(), options=run_options, run_metadata=run_metadata)

    for i in range(1000):
        train_run()

In stdout I print the time taken to run each iteration. (screenshot of the per-iteration timings)

I removed every stage I could think of that has to communicate with the CPU, i.e. logging via tf.summary. I cannot see any reason why the same graph runs at a consistent speed on the CPU, yet the run time on the GPU starts increasing after roughly 200 iterations.

[Update]

I found the problem. The last piece of logging I had not removed was:

run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()

Now that I have removed these two from session.run, the time per iteration on the GPU is stable. I suppose the FULL_TRACE gets longer with every call to session.run, so the time increases roughly linearly. But why does the GPU run time only start increasing at around 200 iterations? And why does it not affect runs on the CPU?
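For reference, a minimal sketch of what the loop looks like with the trace removed, assuming the same sess, train_step, cross_entropy, feed_dict and timeit as above. The periodic-tracing branch is just one possible way to keep occasional profiling without passing the options on every run; TRACE_EVERY and the timeline file name are made up for illustration:

from tensorflow.python.client import timeline

TRACE_EVERY = 100  # illustrative: trace one step out of every 100

@timeit
def train_run(step):
    if step % TRACE_EVERY == 0:
        # trace only this step, with a fresh RunMetadata each time
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
        run_metadata = tf.RunMetadata()
        _, loss = sess.run([train_step, cross_entropy], feed_dict=feed_dict(),
                           options=run_options, run_metadata=run_metadata)
        # dump the chrome trace for this single step, then discard the metadata
        tl = timeline.Timeline(run_metadata.step_stats)
        with open('timeline_step_%d.json' % step, 'w') as f:
            f.write(tl.generate_chrome_trace_format())
    else:
        # plain run: no options, no run_metadata
        _, loss = sess.run([train_step, cross_entropy], feed_dict=feed_dict())

for i in range(1000):
    train_run(i)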

0 Answers:

No answers yet.