How to backpropagate with complex-valued weights

Time: 2017-12-08 20:21:49

Tags: tensorflow neural-network theano complex-numbers

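We are currently trying to replicate the results of the following paper: https://openreview.net/forum?id=H1S8UE-Rb

To do so, we need to run backpropagation on a neural network that contains complex-valued weights.

When we attempt this (using the code in [0]), we get an error (shown in [1]). We could not find the source code of any project that trains a neural network with complex-valued weights.

We would like to know whether we need to implement the paper's adjustments to backpropagation ourselves, or whether this is already part of some neural network library. If it has to be implemented in TensorFlow, what are the right steps to achieve that?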

[0]:

def define_neuron(x):
    """
    x is input tensor
    """

    x = tf.cast(x, tf.complex64)

    mnist_x = mnist_y = 28
    n = mnist_x * mnist_y
    c = 10
    m = 10  # m needs to be calculated

    with tf.name_scope("linear_combination"):
        complex_weight = weight_complex_variable([n,m])
        complex_bias = bias_complex_variable([m])
        h_1 = x @ complex_weight + complex_bias

    return h_1

def main(_):
    mnist = input_data.read_data_sets(
        FLAGS.data_dir,
        one_hot=True,
    )

    # `None` for the first dimension in this shape means that it is variable.
    x_shape = [None, 784]
    x = tf.placeholder(tf.float32, x_shape)
    y_ = tf.placeholder(tf.float32, [None, 10])

    yz = h_1 = define_neuron(x)

    y = tf.nn.softmax(tf.abs(yz))

    with tf.name_scope('loss'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
            labels=y_,
            logits=y,
        )

    cross_entropy = tf.reduce_mean(cross_entropy)

    with tf.name_scope('adam_optimizer'):
        optimizer = tf.train.AdamOptimizer(1e-4)
        optimizer = tf.train.GradientDescentOptimizer(1e-4)
        train_step = optimizer.minimize(cross_entropy)

[1]:

Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
Traceback (most recent call last):
  File "complex.py", line 156, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "complex.py", line 58, in main
    train_step = optimizer.minimize(cross_entropy)
  File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 343, in minimize
    grad_loss=grad_loss)
  File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 419, in compute_gradients
    [v for g, v in grads_and_vars
  File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 547, in _assert_valid_dtypes
    dtype, t.name, [v for v in valid_dtypes]))
ValueError: Invalid type tf.complex64 for linear_combination/Variable:0, expected: [tf.float32, tf.float64, tf.float16].

2 Answers:

Answer 0 (score: 2)

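I also tried to implement a similar network in TensorFlow and found that the optimizers cannot backpropagate into complex-valued variables. The workaround is to keep separate real-valued tensors for the real and imaginary parts. You will have to write a function that computes the magnitude of the network's "complex" output; its square is simply Re^2 + Im^2, and the magnitude itself is the square root of that. This real-valued output is then used to compute the loss.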
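The arithmetic behind this workaround can be sketched in plain Python rather than TensorFlow, so that it stays self-contained. The function name `complex_linear_magnitude` and all numeric values are hypothetical, chosen only for illustration; the input is assumed to be real, as in the question, where a float32 placeholder is cast to complex.

```python
import math


def complex_linear_magnitude(x, w_re, w_im, b_re, b_im):
    """Return |x @ (w_re + i*w_im) + (b_re + i*b_im)| using real arithmetic only.

    x is a real input vector of length n; w_re and w_im are n-by-m nested lists
    holding the real and imaginary parts of the weights.
    """
    n, m = len(w_re), len(w_re[0])
    # For real x, the real and imaginary parts of the product separate cleanly.
    out_re = [sum(x[i] * w_re[i][j] for i in range(n)) + b_re[j] for j in range(m)]
    out_im = [sum(x[i] * w_im[i][j] for i in range(n)) + b_im[j] for j in range(m)]
    # Magnitude sqrt(Re^2 + Im^2): a real value that can feed a real-valued loss.
    return [math.hypot(r, s) for r, s in zip(out_re, out_im)]


# Tiny hand-picked example (2 inputs, 2 outputs); the values are illustrative.
x = [1.0, 2.0]
w_re = [[0.5, -1.0], [0.25, 0.0]]
w_im = [[1.0, 0.5], [-0.5, 2.0]]
b_re = [0.0, 1.0]
b_im = [1.0, 0.0]

magnitudes = complex_linear_magnitude(x, w_re, w_im, b_re, b_im)

# Cross-check against Python's built-in complex arithmetic.
reference = [
    abs(sum(x[i] * complex(w_re[i][j], w_im[i][j]) for i in range(2))
        + complex(b_re[j], b_im[j]))
    for j in range(2)
]
```

Because every stored parameter is real, an optimizer that only accepts float variables would presumably see nothing it rejects. If the input were complex as well, the real part would need the extra cross term from (A + iB)(C + iD) = (AC - BD) + i(AD + BC).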

Answer 1 (score: 0)

Using the built-in optimizers does not work for this; it has been reported as an issue, and I don't think TF 2 supports it yet. However, you can compute the gradients and apply the update by hand.

Gradients computed this way behave as expected and give the values you need. There is also a discussion of how gradients are computed for complex variables.
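As a rough illustration of doing the update by hand, here is a minimal sketch in plain Python rather than TensorFlow. It assumes a single complex weight `w` and a squared-error loss; the names `loss` and `grad` and all numeric values are hypothetical. The gradient is packed as dL/da + i*dL/db for w = a + i*b, which is twice the derivative with respect to the conjugate of w, and it is first compared against a finite-difference estimate.

```python
def loss(w, x, t):
    """Real-valued squared error |w*x - t|^2 for complex scalars."""
    return abs(w * x - t) ** 2


def grad(w, x, t):
    """dL/da + i*dL/db for w = a + i*b, packed into one complex number."""
    return 2 * (w * x - t) * x.conjugate()


x = 1 + 2j   # illustrative input
t = 3 - 1j   # illustrative target

# Check the analytic gradient against central finite differences.
w0 = 0.3 - 0.2j
eps = 1e-6
numeric = (
    (loss(w0 + eps, x, t) - loss(w0 - eps, x, t)) / (2 * eps)
    + 1j * (loss(w0 + 1j * eps, x, t) - loss(w0 - 1j * eps, x, t)) / (2 * eps)
)
gradient_error = abs(numeric - grad(w0, x, t))

# Plain gradient descent applied directly to the complex weight.
w = 0j
learning_rate = 0.05
for _ in range(200):
    w -= learning_rate * grad(w, x, t)

final_loss = loss(w, x, t)
```

With these values the weight should approach `t / x`, the exact minimizer of this loss. The same update rule carries over to weight matrices, with the conjugate replaced by a conjugate transpose.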