Sorry, this is my first time trying to use tensorflow. For the k-th input sample x and learning rate alpha, I am trying to implement the Hebbian learning rule given by

dW = alpha*(I - WW^T)xy^T, with y = W^Tx,

in tensorflow. After some searching I found this code, which implements a variant of the gradient update rule. In that code, the update rule does not depend on the input data. Could you give me some hints on how to adapt this code (probably _apply_dense
and _create_slots
) to implement the learning rule above?
Thanks.
Answer 0 (score: 1)
Let's take an example. Suppose x has dimensions (None, 2) and you feed in a batch of size 4; then x will have shape (4, 2). Let's also assume the weights w have shape (2, 2).
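For concreteness, this is what the batch dimension looks like at run time (a small sketch in the same TF 1.x style as the code below):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 2])   # static shape: (None, 2)
with tf.Session() as sess:
    batch = np.random.randn(4, 2)                 # feed a batch of size 4
    print(sess.run(tf.shape(x), {x: batch}))      # runtime shape: [4 2]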
1. Compute y = W^Tx. Before we can multiply W^T by x, we have to transpose x: y = tf.matmul(tf.transpose(w), tf.transpose(x)). This results in shapes (2, 2)x(2, 4) --> (2, 4).
2. Compute xy^T. Here we transpose x as well: xyT = tf.matmul(tf.transpose(x), tf.transpose(y)), which gives shapes (2, 4)x(4, 2) --> (2, 2).
3. Compute WW^T, which has shape (2, 2).
4. Compute alpha*(I - WW^T), which also has shape (2, 2).
5. Multiply it by xy^T.
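Before writing the TensorFlow version, the shape arithmetic of these five steps can be sanity-checked in plain NumPy (a minimal sketch, assuming the (4, 2) batch and (2, 2) weight matrix from above):

import numpy as np

alpha = 0.01
x = np.random.randn(4, 2)         # batch of 4 samples, 2 features
w = np.random.randn(2, 2)         # weight matrix W

y = w.T @ x.T                     # step 1: y = W^Tx           -> (2, 4)
xyT = x.T @ y.T                   # step 2: xy^T               -> (2, 2)
wwT = w @ w.T                     # step 3: WW^T               -> (2, 2)
diff = alpha * (np.eye(2) - wwT)  # step 4: alpha*(I - WW^T)   -> (2, 2)
update = diff @ xyT               # step 5: alpha*(I-WW^T)xy^T -> (2, 2)

print(y.shape, xyT.shape, wwT.shape, update.shape)  # (2, 4) (2, 2) (2, 2) (2, 2)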
The TensorFlow code:
import tensorflow as tf

def hebian_update(x, alpha=0.01):
    with x.graph.as_default():
        weights = tf.trainable_variables()
        # 1
        y = [tf.matmul(tf.transpose(w), tf.transpose(x)) for w in weights]  # y = W^Tx
        # 2
        xyT = [tf.matmul(tf.transpose(x), tf.transpose(w)) for w in y]  # xy^T
        # 3
        wwT = [tf.matmul(w, tf.transpose(w)) for w in weights]  # WW^T
        wwTshapes = [w.get_shape().as_list() for w in wwT]  # shapes of WW^T
        # 4
        diffs = [alpha*(tf.eye(num_rows=s[0], num_columns=s[1]) - w)
                 for w, s in zip(wwT, wwTshapes)]  # alpha*(I-WW^T)
        # 5
        diffs = [tf.matmul(d, w) for d, w in zip(diffs, xyT)]  # alpha*(I-WW^T)xy^T
        # 6
        update_ops = [tf.assign(w, w + d) for w, d in zip(weights, diffs)]
        return tf.group(update_ops)
Let's test it with a small neural network on a blobs dataset:
# dataset for illustration
from sklearn.datasets import make_blobs
x_train, y_train = make_blobs(n_samples=4,
                              n_features=2,
                              centers=[[1, 1], [-1, -1]],
                              cluster_std=0.5)

x = tf.placeholder(tf.float32, shape=[None, 2])
y = tf.placeholder(tf.int32, shape=[None])

with tf.name_scope('network'):
    fc1 = tf.layers.dense(x, units=2, use_bias=False)
    logits = tf.layers.dense(fc1, units=2, use_bias=False)
    hebian_op = hebian_update(x)

with tf.name_scope('loss'):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss_fn = tf.reduce_mean(xentropy)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(loss_fn.eval({x: x_train, y: y_train}))  # 0.14356796
    _ = sess.run(hebian_op, feed_dict={x: x_train})
    print(loss_fn.eval({x: x_train, y: y_train}))  # 0.3619529
The loss changes after running hebian_op because the weights were updated; since the Hebbian rule is unsupervised, there is no reason for it to decrease the cross-entropy. Also note that it is now your responsibility to make sure that every weight in the neural network is compatible with the input x (i.e. that y = tf.matmul(tf.transpose(w), tf.transpose(x)) can be performed)!
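If some weights in the network are not shape-compatible with x, one option is to filter the trainable variables before building the update ops, instead of taking all of tf.trainable_variables(). A minimal sketch under that assumption (the helper select_compatible_weights is hypothetical, not part of the original answer):

def select_compatible_weights(x):
    # Hypothetical helper: keep only trainable variables w for which
    # tf.matmul(tf.transpose(w), tf.transpose(x)) is defined, i.e. whose
    # first dimension matches the feature dimension of x.
    n_features = x.get_shape().as_list()[1]
    return [w for w in tf.trainable_variables()
            if w.get_shape().as_list()[0] == n_features]

Inside hebian_update, the line weights = tf.trainable_variables() would then become weights = select_compatible_weights(x).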