I read that TensorFlow 2.0 is going to bring some major changes, a big part of which will be eager execution [1], so I tried out TensorFlow's eager mode.
I took some code from a GitHub repo and tried to run it in eager mode (however, without using Keras Model/Layers as suggested). It turned out to be quite slow, so I tried different modifications and compared it against the original (graph-mode) source of the model. The result: graph mode is about 22 times faster than eager mode. I am well aware that graph mode is faster, but by that much?
Is this always the case, or do I need some special modifications/configuration of the variables to get performance comparable to graph mode?
The source code for both variants can be found at [2].
Thanks!
Eager mode:
# With
# with tf.device("/gpu:0"):
# ...
#
# Runtime is 0.35395
# Runtime is 0.12711
# Runtime is 0.12438
# Runtime is 0.12428
# Runtime is 0.12572
# Runtime is 0.12593
# Runtime is 0.12505
# Runtime is 0.12527
# Runtime is 0.12418
# Runtime is 0.12340
Graph mode:
# Runtime is 0.81241
# Runtime is 0.00573
# Runtime is 0.00573
# Runtime is 0.00570
# Runtime is 0.00555
# Runtime is 0.00564
# Runtime is 0.00545
# Runtime is 0.00540
# Runtime is 0.00591
# Runtime is 0.00574
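Each "Runtime is ..." line was measured per iteration, roughly like this (a simplified sketch of the measurement in [2]; train_step stands in for one optimization step):

import time

for step in range(10):
  start = time.time()
  train_step()  # one optimization step: an eager call here, sess.run(...) in the graph variant
  print("Runtime is %.5f" % (time.time() - start))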
[1] https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/JHDpgRyFVUs
[2] https://gist.github.com/lhlmgr/f6709e5aba4a5314b5221d58232b09bd
Answer 0 (score: 2):
Using eager execution may mean unlearning some habits formed with TensorFlow graphs: a piece of code that used to run once (e.g., the Python function that built the graph computing the loss) will now run repeatedly (the same Python function will now compute the loss on every iteration).
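To make that concrete, here is a minimal example of my own (not from the linked code) illustrating the difference:

import tensorflow as tf

def compute_loss(x, w):
  # Graph mode: this Python body runs once, emitting ops into the graph;
  # each sess.run(loss) afterwards executes only those ops.
  # Eager mode: this Python body re-executes on every single call, so
  # every iteration pays the full Python overhead again.
  return tf.reduce_mean(tf.square(tf.matmul(x, w)))

# Graph mode: loss = compute_loss(x, w) once; then sess.run(loss) in the loop.
# Eager mode: compute_loss(x, w) itself is called inside the loop.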
I took a cursory look at the linked code and spotted some easy wins that standard Python profiling tools would also surface; you may want to use one of them (cProfile, py-spy, etc.).
For example, the Keras network is currently implemented as:
class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)

  def call(self, *args, **kwargs):
    num_layers = 6
    d, r = 2, 2
    bijectors = []

    for i in range(num_layers):
      with tf.variable_scope('bijector_%d' % i):
        V = tf.get_variable('V', [d, r], dtype=DTYPE)  # factor loading
        shift = tf.get_variable('shift', [d], dtype=DTYPE)  # affine shift
        L = tf.get_variable('L', [d * (d + 1) // 2], dtype=DTYPE)  # lower triangular
        bijectors.append(tfb.Affine(
          scale_tril=tfd.fill_triangular(L),
          scale_perturb_factor=V,
          shift=shift,
        ))
        alpha = tf.get_variable('alpha', [], dtype=DTYPE)
        abs_alpha = tf.abs(alpha) + .01
        bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)
Instead, if you create the variables once in __init__ and avoid the tf.get_variable calls on every invocation of the network, you should see a big improvement:
class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super(NFModel, self).__init__(*args, **kwargs)
    num_layers = 6
    d, r = 2, 2
    self.num_layers = num_layers
    self.V = [tf.get_variable('V', [d, r], dtype=DTYPE) for _ in range(num_layers)]
    self.shift = [tf.get_variable('shift', [d], dtype=DTYPE) for _ in range(num_layers)]
    self.L = [tf.get_variable('L', [d * (d + 1) // 2], dtype=DTYPE) for _ in range(num_layers)]
    self.alpha = [tf.get_variable('alpha', [], dtype=DTYPE) for _ in range(num_layers)]

  def call(self, *args, **kwargs):
    bijectors = []

    for i in range(self.num_layers):
      V = self.V[i]
      shift = self.shift[i]
      L = self.L[i]
      bijectors.append(tfb.Affine(
        scale_tril=tfd.fill_triangular(L),
        scale_perturb_factor=V,
        shift=shift,
      ))
      alpha = self.alpha[i]
      abs_alpha = tf.abs(alpha) + .01
      bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)
    return {"dist": dist}
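With the variables created in __init__, each call of the model is cheap. For completeness, a rough sketch of how the refactored model might be driven in an eager training loop (the data, loss, and optimizer here are placeholders of my own, not from the gist):

model = NFModel()
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)

for step in range(num_steps):  # num_steps, x_samples: placeholders
  with tf.GradientTape() as tape:
    dist = model(x_samples)["dist"]  # forward pass; no tf.get_variable calls anymore
    loss = -tf.reduce_mean(dist.log_prob(x_samples))  # e.g. negative log-likelihood
  grads = tape.gradient(loss, model.variables)
  optimizer.apply_gradients(zip(grads, model.variables))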
There are probably other easy wins like this, and the profiling tools will nudge you in the right direction.
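For instance, a quick cProfile pass over the training loop will show where the time goes (run_training here is a stand-in for whatever drives your iterations):

import cProfile
import pstats

pr = cProfile.Profile()
pr.enable()
run_training()  # stand-in: your eager training loop
pr.disable()
pstats.Stats(pr).sort_stats('cumtime').print_stats(20)  # top 20 entries by cumulative time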
Also, note that, according to the RFC, TF 2.0 is less about "eager execution" per se and more about how you interact with graphs. Hope that helps.
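As a postscript: that "interacting with graphs" direction is what surfaced as tf.function in TF 2.0 (tf.contrib.eager.defun plays a similar role in 1.x). A Python function is traced into a graph on its first call and the cached graph is reused afterwards, which recovers most of the graph-mode speed while keeping eager-style code. A minimal sketch using the TF 2.0 API, my own illustration:

import tensorflow as tf

@tf.function  # traced into a graph on the first call; later calls reuse the cached graph
def squared_norm(x):
  return tf.reduce_sum(tf.square(x))

x = tf.random.normal([1000])
print(squared_norm(x))  # first call traces; subsequent calls run the graph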