Eager mode is very slow (22x slower than graph mode)

Asked: 2018-10-26 09:19:58

Tags: tensorflow

I read that TensorFlow 2.0 will bring some major changes, a big part of which will be eager execution [1], so I tried out TensorFlow's eager mode.

I took some code from a GitHub repo and tried to run it in eager mode (however, without using Keras Models/Layers as suggested). It turned out to be quite slow, so I tried different modifications and compared it against the model's original source (graph mode). The result: graph mode is about 22 times faster than eager mode. I am fully aware that graph mode is faster, but by this factor?

Is this always the case, or do the variables need some special modification/configuration to reach performance comparable to graph mode?

The source code for both variants can be found at [2].

Thanks!

Eager mode:

# With 
#  with tf.device("/gpu:0"):
#    ...
#
# Runtime is 0.35395
# Runtime is 0.12711
# Runtime is 0.12438
# Runtime is 0.12428
# Runtime is 0.12572
# Runtime is 0.12593
# Runtime is 0.12505
# Runtime is 0.12527
# Runtime is 0.12418
# Runtime is 0.12340

Graph mode:

# Runtime is 0.81241
# Runtime is 0.00573
# Runtime is 0.00573
# Runtime is 0.00570
# Runtime is 0.00555
# Runtime is 0.00564
# Runtime is 0.00545
# Runtime is 0.00540
# Runtime is 0.00591
# Runtime is 0.00574

[1] https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/JHDpgRyFVUs

[2] https://gist.github.com/lhlmgr/f6709e5aba4a5314b5221d58232b09bd

1 Answer:

Answer 0 (score: 2):

Using eager execution may mean unlearning some habits developed for TensorFlow graphs, since a snippet of code that used to run once (e.g., the Python function that built the graph to compute the loss) will now run repeatedly (the same Python function computes the loss on every iteration).
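
To make that concrete, here is a minimal sketch (compute_loss is illustrative, not taken from the linked code) of where the Python-level work happens in each mode:

import tensorflow as tf

def compute_loss(x):
  # In graph mode this Python body runs once, to build the graph;
  # in eager mode it runs again on every training iteration.
  print("running Python loss code")
  return tf.reduce_sum(tf.square(x))

# Graph mode (sketch): build once, then execute the cached graph repeatedly.
#   loss = compute_loss(x_placeholder)   # printed a single time
#   for _ in range(100):
#     sess.run(loss, feed_dict={...})    # no Python body re-executed
#
# Eager mode (sketch): the whole function body, including any op or variable
# construction it performs, re-executes on every step.
#   for _ in range(100):
#     loss = compute_loss(x_batch)       # printed 100 times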

I took a quick look at the linked code and spotted some easy wins that standard Python profiling tools (cProfile, py-spy, etc.) would probably surface as well; you may want to use them.
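
For instance, one quick way to profile a single iteration with cProfile (train_step is a hypothetical stand-in for one iteration of the gist's training loop):

import cProfile
import pstats

# Profile one training iteration and dump the stats to a file.
cProfile.run('train_step()', 'eager_profile')

# Print the 20 most expensive calls, sorted by cumulative time.
stats = pstats.Stats('eager_profile')
stats.sort_stats('cumulative').print_stats(20)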

For example, the Keras network is currently implemented as:

class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)

  def call(self, *args, **kwargs):
    num_layers = 6
    d, r = 2, 2
    bijectors = []

    for i in range(num_layers):
      with tf.variable_scope('bijector_%d' % i):
        V = tf.get_variable('V', [d, r], dtype=DTYPE)  # factor loading
        shift = tf.get_variable('shift', [d], dtype=DTYPE)  # affine shift
        L = tf.get_variable('L', [d * (d + 1) // 2], dtype=DTYPE)  # lower triangular (integer division for the shape)
        bijectors.append(tfb.Affine(
          scale_tril=tfd.fill_triangular(L),
          scale_perturb_factor=V,
          shift=shift,
        ))

        alpha = tf.get_variable('alpha', [], dtype=DTYPE)
        abs_alpha = tf.abs(alpha) + .01
        bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)

Instead, if you create the variables once in __init__ and avoid the tf.get_variable calls on every invocation of the network, you should see a big improvement:

class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super(NFModel, self).__init__(*args, **kwargs)
    num_layers = 6
    d, r = 2, 2
    self.num_layers = num_layers
    # Create each variable once, with a unique per-layer name (so the same
    # code also works under graph-mode variable scoping).
    self.V = [tf.get_variable('V_%d' % i, [d, r], dtype=DTYPE) for i in range(num_layers)]  # factor loadings
    self.shift = [tf.get_variable('shift_%d' % i, [d], dtype=DTYPE) for i in range(num_layers)]  # affine shifts
    self.L = [tf.get_variable('L_%d' % i, [d * (d + 1) // 2], dtype=DTYPE) for i in range(num_layers)]  # lower-triangular factors
    self.alpha = [tf.get_variable('alpha_%d' % i, [], dtype=DTYPE) for i in range(num_layers)]  # LeakyReLU slopes


  def call(self, *args, **kwargs):
    bijectors = []

    for i in range(self.num_layers):
      V = self.V[i]
      shift = self.shift[i]
      L = self.L[i]
      bijectors.append(tfb.Affine(
        scale_tril=tfd.fill_triangular(L),
        scale_perturb_factor=V,
        shift=shift,
      ))

      alpha = self.alpha[i]
      abs_alpha = tf.abs(alpha) + .01
      bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)

    return {"dist": dist}
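
For completeness, a sketch of how the refactored model might be driven in an eager training loop (the batch data and hyperparameters are placeholders, not taken from the gist, and it builds on the definitions above); variable creation now happens exactly once, at construction time:

import tensorflow as tf

tf.enable_eager_execution()

x_samples = tf.random_normal([512, 2])  # stand-in training batch
model = NFModel()                       # all variables are created here, once
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)

for step in range(1000):
  with tf.GradientTape() as tape:
    dist = model(x_samples)["dist"]
    loss = -tf.reduce_mean(dist.log_prob(x_samples))  # negative log-likelihood
  # If model.variables does not pick up the get_variable-created lists,
  # collect them explicitly: model.V + model.shift + model.L + model.alpha
  variables = model.variables
  grads = tape.gradient(loss, variables)
  optimizer.apply_gradients(zip(grads, variables))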

There are probably other easy wins like this, and the profiling tools will nudge you in the right direction.

Also, note that according to the RFC, TF 2.0 is less about "eager execution" and more about how you interact with graphs.
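
In practice that means eager-style code can recover graph-like performance by compiling a Python function into a callable graph: tf.contrib.eager.defun in TF 1.x, which becomes tf.function in TF 2.0. A minimal sketch (loss_fn is illustrative):

import tensorflow as tf

tf.enable_eager_execution()

def loss_fn(x):
  return tf.reduce_sum(tf.square(x))

# Compile the Python function into a graph once; later calls execute the
# cached graph instead of re-running the Python body each time.
fast_loss_fn = tf.contrib.eager.defun(loss_fn)

x = tf.random_normal([512, 2])
print(fast_loss_fn(x))
# In TF 2.0 the equivalent is: fast_loss_fn = tf.function(loss_fn)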

Hope that helps.