Unable to split a gradient tape between 2 computers in TensorFlow 2

Date: 2020-02-07 16:18:22

Tags: tensorflow tensorflow-gradient

TensorFlow version: 2.0.0

I am training 2 stacked models, model1 and model2, in TensorFlow 2.0.0, intended to run on 2 different computers. For now I am simulating both on the same machine. If I train the models under the same gradient tape, training works perfectly, computing the gradients as:

    import tensorflow as tf
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    with tf.GradientTape(persistent=True) as tape:
        features = model1(x_batch_train)
        logits = model2(features)
        loss_value = loss_fn(y_batch_train, logits)

    # Computing gradients for model 2
    grads2 = tape.gradient(loss_value, model2.trainable_variables)
    # Computing gradients to pass to model 1
    grads_pass = tape.gradient(loss_value, features)
    # Computing gradients for model 1
    grads1 = tape.gradient(features, model1.trainable_variables, output_gradients=grads_pass)

However, if I split the training across 2 different computers, each computer should have its own gradient tape, but then the computed gradients come out different!

    import tensorflow as tf
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    with tf.GradientTape() as tape1:
        features = model1(x_batch_train)

    with tf.GradientTape(persistent=True) as tape2:
        logits = model2(features)
        loss_value = loss_fn(y_batch_train, logits)

    # Computing gradients for model 2
    grads2 = tape2.gradient(loss_value, model2.trainable_variables)
    # Computing gradients to pass to model 1
    grads_pass = tape2.gradient(loss_value, features)
    # Computing gradients for model 1
    grads1 = tape1.gradient(features, model1.trainable_variables, output_gradients=grads_pass)

The operations recorded on the two tapes are independent, so why does this happen?
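[Editor's note] A hedged sketch of one likely cause: by default a `tf.GradientTape` only watches trainable `tf.Variable`s, so in the two-tape version `features` (a plain tensor) is never tracked by `tape2`, and `tape2.gradient(loss_value, features)` returns `None` instead of the gradient to pass back. Calling `tape2.watch(features)` inside the second tape's context would restore the single-tape behavior. The tiny `Dense` models below are hypothetical stand-ins for the question's model1/model2:

```python
import tensorflow as tf

# Hypothetical stand-ins for model1 and model2 from the question.
model1 = tf.keras.Sequential([tf.keras.layers.Dense(4, activation="relu")])
model2 = tf.keras.Sequential([tf.keras.layers.Dense(2)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

x_batch_train = tf.random.normal((5, 3))
y_batch_train = tf.constant([0, 1, 0, 1, 1])

# "Computer 1": record model1's forward pass on its own tape.
with tf.GradientTape() as tape1:
    features = model1(x_batch_train)

# "Computer 2": record model2's forward pass, explicitly watching the
# incoming activations. A plain tensor is NOT watched by default, so
# without this call tape2.gradient(loss_value, features) returns None.
with tf.GradientTape(persistent=True) as tape2:
    tape2.watch(features)
    logits = model2(features)
    loss_value = loss_fn(y_batch_train, logits)

# Gradients for model 2's own weights.
grads2 = tape2.gradient(loss_value, model2.trainable_variables)
# Gradient of the loss w.r.t. the activations crossing the boundary.
grads_pass = tape2.gradient(loss_value, features)
# Backpropagate through model 1 using the received upstream gradient.
grads1 = tape1.gradient(features, model1.trainable_variables,
                        output_gradients=grads_pass)
```

With the watch in place, `grads_pass` is a real tensor and `grads1` should match the gradients produced by the single-tape version, since the chain rule is simply split at `features`.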

0 Answers:

No answers yet