I am trying to build some intuition for machine learning. I looked at the examples at https://github.com/deeplearning4j/dl4j-0.4-examples and wanted to develop my own. Basically I just took a simple function, a * a + b * b + c * c - a * b * c + a + b + c, generated 100,000 outputs for random a, b, c, and tried to train my network on 90% of the inputs. The thing is, no matter what I did, my network could not predict the remaining examples.
Here is my code:
public class BasicFunctionNN {
    private static Logger log = LoggerFactory.getLogger(BasicFunctionNN.class);

    public static DataSetIterator generateFunctionDataSet() {
        Collection<DataSet> list = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            double a = Math.random();
            double b = Math.random();
            double c = Math.random();
            double output = a * a + b * b + c * c - a * b * c + a + b + c;
            INDArray in = Nd4j.create(new double[]{a, b, c});
            INDArray out = Nd4j.create(new double[]{output});
            list.add(new DataSet(in, out));
        }
        return new ListDataSetIterator(list, list.size());
    }

    public static void main(String[] args) throws Exception {
        DataSetIterator iterator = generateFunctionDataSet();
        Nd4j.MAX_SLICES_TO_PRINT = 10;
        Nd4j.MAX_ELEMENTS_PER_SLICE = 10;

        final int numInputs = 3;
        int outputNum = 1;
        int iterations = 100;

        log.info("Build model....");
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .iterations(iterations).weightInit(WeightInit.XAVIER).updater(Updater.SGD).dropOut(0.5)
                .learningRate(.8).regularization(true)
                .l1(1e-1).l2(2e-4)
                .optimizationAlgo(OptimizationAlgorithm.LINE_GRADIENT_DESCENT)
                .list(3)
                .layer(0, new DenseLayer.Builder().nIn(numInputs).nOut(8)
                        .activation("identity")
                        .build())
                .layer(1, new DenseLayer.Builder().nIn(8).nOut(8)
                        .activation("identity")
                        .build())
                .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.RMSE_XENT)
                        .activation("identity")
                        .weightInit(WeightInit.XAVIER)
                        .nIn(8).nOut(outputNum).build())
                .backprop(true).pretrain(false)
                .build();

        // run the model
        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        model.setListeners(Collections.singletonList((IterationListener) new ScoreIterationListener(iterations)));

        // get the dataset using the record reader; the DataSetIterator handles vectorization
        DataSet next = iterator.next();
        SplitTestAndTrain testAndTrain = next.splitTestAndTrain(0.9);
        System.out.println(testAndTrain.getTrain());
        model.fit(testAndTrain.getTrain());

        // evaluate the model
        Evaluation eval = new Evaluation(10);
        DataSet test = testAndTrain.getTest();
        INDArray output = model.output(test.getFeatureMatrix());
        eval.eval(test.getLabels(), output);
        log.info(">>>>>>>>>>>>>>");
        log.info(eval.stats());
    }
}
I have also played with the learning rate, and much of the time the score does not improve:
10:48:51.404 [main] DEBUG o.d.o.solvers.BackTrackLineSearch - Exited line search after maxIterations termination condition; score did not improve (bestScore=0.8522868127536543, scoreAtStart=0.8522868127536543). Resetting parameters
I also tried changing the activation function.
Answer (score: 1):
One obvious problem is that you are trying to model a nonlinear function with a linear model. Your neural network has no nonlinear activation function, so it can only express functions of the form W1a + W2b + W3c + W4. No matter how many hidden units you create, as long as no nonlinear activation function is used, your network degenerates into a simple linear model.
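This collapse can be verified with a few lines of plain Java, independent of DL4J (the class and method names below are made up for illustration): two stacked layers with identity activation produce exactly the same outputs as one linear layer whose weights are W2*W1 and whose bias is W2*b1 + b2.

```java
// Standalone check that two identity-activation layers compose into one
// linear layer: W2 (W1 x + b1) + b2 == (W2 W1) x + (W2 b1 + b2).
public class LinearCollapse {

    static double[] matVec(double[][] m, double[] v) {
        double[] r = new double[m.length];
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < v.length; j++)
                r[i] += m[i][j] * v[j];
        return r;
    }

    static double[][] matMul(double[][] a, double[][] b) {
        double[][] r = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int k = 0; k < b.length; k++)
                for (int j = 0; j < b[0].length; j++)
                    r[i][j] += a[i][k] * b[k][j];
        return r;
    }

    static double[] vecAdd(double[] a, double[] b) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++) r[i] = a[i] + b[i];
        return r;
    }

    // Difference between the stacked two-layer output and the output of the
    // single collapsed linear layer, for one sample input.
    static double collapseError() {
        double[][] w1 = {{0.5, -1.0, 2.0}, {1.5, 0.25, -0.75}};
        double[] b1 = {0.1, -0.2};
        double[][] w2 = {{2.0, -0.5}};
        double[] b2 = {0.3};
        double[] x = {0.3, 0.7, 0.9};

        // Layer 1 then layer 2, both with identity activation:
        double[] stacked = vecAdd(matVec(w2, vecAdd(matVec(w1, x), b1)), b2);
        // One equivalent layer with weights W2*W1 and bias W2*b1 + b2:
        double[] collapsed = vecAdd(matVec(matMul(w2, w1), x),
                                    vecAdd(matVec(w2, b1), b2));
        return Math.abs(stacked[0] - collapsed[0]);
    }

    public static void main(String[] args) {
        System.out.println(collapseError() < 1e-12);
    }
}
```

The two results agree to floating-point precision, which is why adding more identity-activation hidden layers buys nothing.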
There are also many smaller issues, including but not limited to:
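Following the answer's main point, the essential fix is to give the hidden layers a nonlinear activation. Below is a hedged sketch against the same DL4J 0.4-era builder API as in the question; the hyperparameters are illustrative rather than tuned, and the dropout and heavy l1 penalty from the original configuration are dropped because they tend to make a small regression network underfit.

```java
// Illustrative revision of the question's configuration (not a tuned setup):
// tanh hidden layers supply the missing nonlinearity; MSE loss and an
// identity output are a conventional choice for scalar regression.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .iterations(100).weightInit(WeightInit.XAVIER).updater(Updater.SGD)
        .learningRate(0.01)             // 0.8 is far too aggressive for plain SGD
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .list(3)
        .layer(0, new DenseLayer.Builder().nIn(3).nOut(8)
                .activation("tanh")     // the crucial change: a nonlinearity
                .build())
        .layer(1, new DenseLayer.Builder().nIn(8).nOut(8)
                .activation("tanh")
                .build())
        .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .activation("identity") // linear output is fine for regression
                .nIn(8).nOut(1).build())
        .backprop(true).pretrain(false)
        .build();
```

With a nonlinear hidden layer the network can in principle approximate the quadratic target; the remaining hyperparameters still need experimentation.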