I am trying to add Bayesian weight uncertainty to my Keras layers. After some coding I realized that the Bayesifier can be implemented as a wrapper around any kind of layer, but I am facing a problem.
A primer on Bayesian weight uncertainty:
Consider an ANN layer with m learnable parameters. The Bayesian version of this layer puts a Gaussian distribution over the weights, so it stores another set of m parameters: the first m are treated as the means and the new m as the variances, meaning every weight is characterized by its own normal distribution.
In the learning phase we draw the weights from these distributions. Thanks to the reparameterization trick, we can still backpropagate.
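To make the sampling step concrete, here is the reparameterization trick in isolation (a plain NumPy sketch; all names are my own, and as in the wrapper code below I push the raw parameter through a softplus to get the stddev):

    import numpy as np

    mu = np.zeros(5)                # the original m parameters, used as means
    rho = np.ones(5)                # the extra m "variation" parameters
    sigma = np.log1p(np.exp(rho))   # softplus keeps the stddev positive
    eps = np.random.normal(size=5)  # noise drawn independently of mu and rho
    w = mu + sigma * eps            # sampled weights, differentiable in mu and rho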
As you can see, converting a layer into a Bayesian one does not depend on the layer's architecture: we simply duplicate the weights.
So I built a Keras wrapper class along these lines:
from keras.layers.wrappers import Wrapper
from keras import backend as K

K.set_learning_phase(1)


class Bayesify(Wrapper):

    def __init__(self, base_layer,
                 variational_initializer="ones",
                 variational_regularizer=None,
                 variational_constraint=None,
                 **kwargs):
        super().__init__(base_layer, **kwargs)
        self.variational_initializer = variational_initializer
        self.variational_regularizer = variational_regularizer
        self.variational_constraint = variational_constraint
        self.mean = []
        self.variation = []

    def build(self, input_shape=None):
        super().build(input_shape)
        if not self.layer.built:
            self.layer.build(input_shape)
            self.layer.built = True
        # Reuse the wrapped layer's own weights as the means and add one
        # matching "variation" tensor for every weight tensor.
        self.mean = self.layer.trainable_weights[:]
        for tensor in self.mean:
            self.variation.append(self.add_weight(
                name="variation",
                shape=tensor.shape,
                initializer=self.variational_initializer,
                regularizer=self.variational_regularizer,
                constraint=self.variational_constraint
            ))
        self._trainable_weights = self.mean + self.variation

    def _sample_weights(self):
        # Reparameterization trick: the raw "variation" parameter goes
        # through a softplus so the resulting stddev is always positive.
        return [mean + K.log(1. + K.exp(log_stddev)) * K.random_normal(shape=mean.shape)
                for mean, log_stddev in zip(self.mean, self.variation)]

    def call(self, inputs, **kwargs):
        # Sample the weights in the training phase, use the means at test
        # time, then forward to the wrapped layer's call().
        self.layer._trainable_weights = K.in_train_phase(self._sample_weights(),
                                                         self.mean)
        return self.layer.call(inputs, **kwargs)
And a short script that runs the Bayesian Dense layer on MNIST:
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input
from keras.datasets import mnist
from bayesify import Bayesify
(lX, lY), (tX, tY) = mnist.load_data()
lX, tX = lX / 255., tX / 255.
onehot = np.eye(10)
lY, tY = onehot[lY], onehot[tY]
inputs = Input(lX.shape[1:])
x = Bayesify(Dense(30, activation="tanh"))(inputs)
x = Dense(lY.shape[1], activation="softmax")(x)
ann = Model(inputs=inputs, outputs=x)
ann.compile(optimizer="adam", loss="categorical_crossentropy")
ann.fit(lX, lY, batch_size=64, epochs=10, validation_data=(tX, tY))
The training script raises strange internal Keras exceptions which I am trying to figure out right now. Also, I am not entirely sure that the gradients with respect to the wrapper's weights will be handled correctly. I am currently using the setter of the layer.trainable_weights property to push the sampled weights into the wrapped layer, and call() simply forwards to layer.call().
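My main worry, sketched below, is that assigning to _trainable_weights might not actually put the sampled tensors into the computational graph: if I read Dense correctly, its call() uses self.kernel and self.bias, not _trainable_weights. A minimal alternative I am considering, hardcoded for a wrapped Dense layer (the attribute names kernel, bias and use_bias are my assumptions about the wrapped layer):

    def call(self, inputs, **kwargs):
        # Overwrite the attributes Dense.call() actually reads, so the
        # sampled tensors (train phase) or the means (test phase) end up
        # in the forward graph and gradients can flow through them.
        sampled = self._sample_weights()
        self.layer.kernel = K.in_train_phase(sampled[0], self.mean[0])
        if self.layer.use_bias:
            self.layer.bias = K.in_train_phase(sampled[1], self.mean[1])
        return self.layer.call(inputs, **kwargs)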
One such error:
File "/data/Prog/PyCharm/bayesforge/xp_mnist.py", line 20, in <module>
ann.fit(lX, lY, batch_size=64, epochs=10, validation_data=(tX, tY))
File "/usr/lib/python3.6/site-packages/keras/engine/training.py", line
1630, in fit
batch_size=batch_size)
File "/usr/lib/python3.6/site-packages/keras/engine/training.py", line
1480, in _standardize_user_data
exception_prefix='target')
File "/usr/lib/python3.6/site-packages/keras/engine/training.py", line
113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_2 to have 3
dimensions, but got array with shape (60000, 10)
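I suspect this particular ValueError is actually unrelated to the wrapper: each MNIST sample is a 28x28 array, so the Dense layers produce an output of shape (None, 28, 10) while the one-hot targets have shape (60000, 10). Flattening the images before building the model should remove at least this error (a sketch, untested beyond the shapes):

    # Flatten each 28x28 image into a 784-vector so the model's output
    # is (batch, 10), matching the one-hot targets.
    lX = lX.reshape(len(lX), -1)
    tX = tX.reshape(len(tX), -1)
    inputs = Input(lX.shape[1:])  # shape is now (784,)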
A small note: I have only tested my script with the TensorFlow backend.
Does anyone have any suggestions on how to pull off this kind of wrapper?