I am trying to build some intuition for machine learning. I looked at the examples at https://github.com/deeplearning4j/dl4j-0.4-examples and wanted to develop my own. Basically I just took a simple function, a * a + b * b + c * c - a * b * c + a + b + c, generated 100,000 outputs for random a, b, c, and tried to train my network on 90% of the inputs. The thing is, no matter what I did, my network could not predict the remaining examples.
Here is my code:
public class BasicFunctionNN {
    private static Logger log = LoggerFactory.getLogger(BasicFunctionNN.class);

    public static DataSetIterator generateFunctionDataSet() {
        Collection<DataSet> list = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            double a = Math.random();
            double b = Math.random();
            double c = Math.random();
            double output = a * a + b * b + c * c - a * b * c + a + b + c;
            INDArray in = Nd4j.create(new double[]{a, b, c});
            INDArray out = Nd4j.create(new double[]{output});
            list.add(new DataSet(in, out));
        }
        return new ListDataSetIterator(list, list.size());
    }

    public static void main(String[] args) throws Exception {
        DataSetIterator iterator = generateFunctionDataSet();
        Nd4j.MAX_SLICES_TO_PRINT = 10;
        Nd4j.MAX_ELEMENTS_PER_SLICE = 10;

        final int numInputs = 3;
        int outputNum = 1;
        int iterations = 100;

        log.info("Build model....");
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .iterations(iterations).weightInit(WeightInit.XAVIER).updater(Updater.SGD).dropOut(0.5)
                .learningRate(.8).regularization(true)
                .l1(1e-1).l2(2e-4)
                .optimizationAlgo(OptimizationAlgorithm.LINE_GRADIENT_DESCENT)
                .list(3)
                .layer(0, new DenseLayer.Builder().nIn(numInputs).nOut(8)
                        .activation("identity")
                        .build())
                .layer(1, new DenseLayer.Builder().nIn(8).nOut(8)
                        .activation("identity")
                        .build())
                .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.RMSE_XENT)
                        .activation("identity")
                        .weightInit(WeightInit.XAVIER)
                        .nIn(8).nOut(outputNum).build())
                .backprop(true).pretrain(false)
                .build();

        // run the model
        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        model.setListeners(Collections.singletonList((IterationListener) new ScoreIterationListener(iterations)));

        // get the dataset using the record reader; the DataSetIterator handles vectorization
        DataSet next = iterator.next();
        SplitTestAndTrain testAndTrain = next.splitTestAndTrain(0.9);
        System.out.println(testAndTrain.getTrain());
        model.fit(testAndTrain.getTrain());

        // evaluate the model
        Evaluation eval = new Evaluation(10);
        DataSet test = testAndTrain.getTest();
        INDArray output = model.output(test.getFeatureMatrix());
        eval.eval(test.getLabels(), output);
        log.info(">>>>>>>>>>>>>>");
        log.info(eval.stats());
    }
}
I have also played with the learning rate, and much of the time the score does not improve:
10:48:51.404 [main] DEBUG o.d.o.solvers.BackTrackLineSearch - Exited line search after maxIterations termination condition; score did not improve (bestScore=0.8522868127536543, scoreAtStart=0.8522868127536543). Resetting parameters
I also tried changing the activation function.
Answer (score: 1):
One obvious problem is that you are trying to model a nonlinear function with a linear model. Your neural network has no nonlinear activation function, so it can only express functions of the form W1a + W2b + W3c + W4. No matter how many hidden units you create, as long as no nonlinear activation function is used, your network degenerates into a simple linear model.
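This collapse can be verified with a few lines of plain Java, independent of DL4J (the class and method names below are made up for illustration): two stacked layers with identity activation produce exactly the same outputs as one linear layer whose weights are W2*W1 and whose bias is W2*b1 + b2.

```java
// Standalone check that two identity-activation layers compose into one
// linear layer: W2 (W1 x + b1) + b2 == (W2 W1) x + (W2 b1 + b2).
public class LinearCollapse {

    static double[] matVec(double[][] m, double[] v) {
        double[] r = new double[m.length];
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < v.length; j++)
                r[i] += m[i][j] * v[j];
        return r;
    }

    static double[][] matMul(double[][] a, double[][] b) {
        double[][] r = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int k = 0; k < b.length; k++)
                for (int j = 0; j < b[0].length; j++)
                    r[i][j] += a[i][k] * b[k][j];
        return r;
    }

    static double[] vecAdd(double[] a, double[] b) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++) r[i] = a[i] + b[i];
        return r;
    }

    // Difference between the stacked two-layer output and the output of the
    // single collapsed linear layer, for one sample input.
    static double collapseError() {
        double[][] w1 = {{0.5, -1.0, 2.0}, {1.5, 0.25, -0.75}};
        double[] b1 = {0.1, -0.2};
        double[][] w2 = {{2.0, -0.5}};
        double[] b2 = {0.3};
        double[] x = {0.3, 0.7, 0.9};

        // Layer 1 then layer 2, both with identity activation:
        double[] stacked = vecAdd(matVec(w2, vecAdd(matVec(w1, x), b1)), b2);
        // One equivalent layer with weights W2*W1 and bias W2*b1 + b2:
        double[] collapsed = vecAdd(matVec(matMul(w2, w1), x),
                                    vecAdd(matVec(w2, b1), b2));
        return Math.abs(stacked[0] - collapsed[0]);
    }

    public static void main(String[] args) {
        System.out.println(collapseError() < 1e-12);
    }
}
```

The two results agree to floating-point precision, which is why adding more identity-activation hidden layers buys nothing.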
There are also many smaller issues, including but not limited to:
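Following the answer's main point, the essential fix is to give the hidden layers a nonlinear activation. Below is a hedged sketch against the same DL4J 0.4-era builder API as in the question; the hyperparameters are illustrative rather than tuned, and the dropout and heavy l1 penalty from the original configuration are dropped because they tend to make a small regression network underfit.

```java
// Illustrative revision of the question's configuration (not a tuned setup):
// tanh hidden layers supply the missing nonlinearity; MSE loss and an
// identity output are a conventional choice for scalar regression.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .iterations(100).weightInit(WeightInit.XAVIER).updater(Updater.SGD)
        .learningRate(0.01)             // 0.8 is far too aggressive for plain SGD
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .list(3)
        .layer(0, new DenseLayer.Builder().nIn(3).nOut(8)
                .activation("tanh")     // the crucial change: a nonlinearity
                .build())
        .layer(1, new DenseLayer.Builder().nIn(8).nOut(8)
                .activation("tanh")
                .build())
        .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .activation("identity") // linear output is fine for regression
                .nIn(8).nOut(1).build())
        .backprop(true).pretrain(false)
        .build();
```

With a nonlinear hidden layer the network can in principle approximate the quadratic target; the remaining hyperparameters still need experimentation.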