I am using AdaDelta as the optimizer for a neural network (XOR problem, one hidden layer with tanh activation, sigmoid activation on the output layer).
import cntk as C

INPUT_DIM = 2
OUTPUT_DIM = 1
input_ = C.input_variable(INPUT_DIM, 'float32')
output_ = C.input_variable(OUTPUT_DIM, 'float32')
# Hidden layer uses the default tanh activation; the output layer overrides it with sigmoid.
with C.layers.default_options(activation=C.tanh, init=C.glorot_uniform()):
    model = C.layers.Sequential([
        C.layers.Dense(4),
        C.layers.Dense(OUTPUT_DIM, activation=C.sigmoid)
    ])
model = model(input_)
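The C# network is configured in a similar way. The Python and C# APIs take different parameters for AdaDelta.
In Python it is this simple, because most parameters have default values: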
learner = C.adadelta(model.parameters)
In C#, I set them explicitly, following the previous link, and it looks like this:
double learningRate = 1;
var schedule = new TrainingParameterScheduleDouble(learningRate);
var options = new AdditionalLearningOptions()
{
    l1RegularizationWeight = 0,
    l2RegularizationWeight = 0,
    gradientClippingThresholdPerSample = double.PositiveInfinity,
    gradientClippingWithTruncation = true
};
var parameterVector = new ParameterVector(model.Parameters().ToArray());
IList<Learner> parameterLearners = new List<Learner>()
{
    CNTKLib.AdaDeltaLearner(parameterVector,
                            schedule,
                            rho: 0.95,
                            epsilon: 1e-8,
                            additionalOptions: options)
};
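For context on what rho and epsilon control, here is a minimal pure-Python sketch of a single AdaDelta update. It assumes the textbook rule with a learning rate that simply scales the step; `adadelta_step` is a hypothetical helper written for illustration, not a CNTK function, and CNTK's internal implementation may differ in detail.

```python
import math

def adadelta_step(x, grad, state, lr=1.0, rho=0.95, epsilon=1e-8):
    """One textbook AdaDelta update for a scalar parameter (illustrative only)."""
    avg_sq_grad, avg_sq_delta = state
    # Exponentially decaying average of squared gradients, controlled by rho.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    # Step is the gradient scaled by RMS(previous updates) / RMS(gradients);
    # epsilon keeps both square roots away from zero.
    delta = -math.sqrt(avg_sq_delta + epsilon) / math.sqrt(avg_sq_grad + epsilon) * grad
    # Exponentially decaying average of squared updates.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    # Assumption: the learning rate multiplies the AdaDelta step.
    return x + lr * delta, (avg_sq_grad, avg_sq_delta)

# Usage: a few updates on f(x) = x^2, whose gradient is 2x.
x, state = 1.0, (0.0, 0.0)
for _ in range(5):
    x, state = adadelta_step(x, 2.0 * x, state)
```

With epsilon this small the first steps are tiny, so under this rule the effective scale of the learning rate has a visible effect on how quickly training gets going.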
The Python version converges faster, and not only on this network. Is there any way to get the same performance in C#?