Trained neural network outputs the same result for every evaluation row

Date: 2018-09-26 04:35:29

Tags: java neural-network encog

Training the network seems to go fine, since it converges and the error drops below 0.01. However, when I load the trained network and feed it the evaluation set, it outputs the same result for every evaluation row (this is the actual prediction run, not the training phase). I trained the network with resilient propagation, using 9 inputs, one hidden layer with 7 hidden neurons, and 1 output neuron. Update: my data is normalized with min-max normalization. I am trying to predict electricity load data.

Here is some sample data; the first 9 columns are the inputs and the 10th column is the ideal value:

0.5386671932975533, 1100000.0, 0.0, 1.0, 40.0, 1.0, 30.0, 9.0, 2014.0, 0.5260616667545941
0.5260616667545941, 1100000.0, 0.0, 1.0, 40.0, 2.0, 30.0, 9.0, 2014.0, 0.5196499668339777
0.5196499668339777, 1100000.0, 0.0, 1.0, 40.0, 3.0, 30.0, 9.0, 2014.0, 0.5083828048375548
0.5083828048375548, 1100000.0, 0.0, 1.0, 40.0, 4.0, 30.0, 9.0, 2014.0, 0.49985462144799725
0.49985462144799725, 1100000.0, 0.0, 1.0, 40.0, 5.0, 30.0, 9.0, 2014.0, 0.49085956670499675
0.49085956670499675, 1100000.0, 0.0, 1.0, 40.0, 6.0, 30.0, 9.0, 2014.0, 0.485008112408512
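
For clarity, the min-max normalization mentioned above scales each column into [0, 1] using that column's minimum and maximum. A minimal sketch of that step (the method names and bounds here are illustrative; the actual preprocessing code is not shown in the question):

public static double minMaxNormalize(double value, double min, double max) {
    // scale a raw value into [0, 1] given the column's observed min and max
    return (value - min) / (max - min);
}

public static double minMaxDenormalize(double normalized, double min, double max) {
    // map a network output back to the original scale
    return normalized * (max - min) + min;
}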

Here is the complete code:

// imports assume the Encog 3.x package layout; adjust to your Encog version
import java.io.File;

import org.encog.Encog;
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLData;
import org.encog.ml.data.MLDataPair;
import org.encog.ml.data.MLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;
import org.encog.persist.EncogDirectoryPersistence;
import org.encog.platformspecific.j2se.data.SQLNeuralDataSet;

public class ANN
{   
//training
//public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2012,4,1,1) and (2014,9,29, 96) ORDER BY ID";
//testing
public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2014,9,30,1) and (2014,9,30, 92) ORDER BY ID";
//validation
//public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2014,9,30,93) and (2014,9,30, 96) ORDER BY ID";
public final static int INPUT_SIZE = 9;
public final static int IDEAL_SIZE = 1;
public final static String SQL_DRIVER = "org.postgresql.Driver";
public final static String SQL_URL = "jdbc:postgresql://localhost/ANN";
public final static String SQL_UID = "postgres";
public final static String SQL_PWD = "";

public static void main(String args[])
{   
    Mynetwork();
    //train network. will add customizable params later.
    //train(trainingData());
    //evaluate network
    evaluate(trainingData());
    Encog.getInstance().shutdown();
}
public static void evaluate(MLDataSet testSet)
{
    BasicNetwork network = (BasicNetwork)EncogDirectoryPersistence.loadObject(new File("directory"));

    // test the neural network
    System.out.println("Neural Network Results:");
    for(MLDataPair pair: testSet ) {
        final MLData output = network.compute(pair.getInput());
        System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1) + "," + pair.getInput().getData(2) + "," + pair.getInput().getData(3) + "," + pair.getInput().getData(4) + "," + pair.getInput().getData(5) + "," + pair.getInput().getData(6) + "," + pair.getInput().getData(7) + "," + pair.getInput().getData(8) + "," + "Predicted=" + output.getData(0) + ", Actual=" + pair.getIdeal().getData(0));
    }
}
public static BasicNetwork Mynetwork()
{
    //basic neural network template. The input layer shouldn't have an activation function,
    //because an activation transforms data coming from the previous layer and there is no layer before the input.
    BasicNetwork network = new BasicNetwork();
    //input layer with 9 neurons.
    //The 'true' parameter means that it should have a bias neuron. Bias neuron affects the next layer.
    network.addLayer(new BasicLayer(null , true, 9));
    //hidden layer with 5 neurons
    network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 5));
    //output layer with 1 neuron
    network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
    network.getStructure().finalizeStructure() ;
    network.reset();

    return network;
}
public static void train(MLDataSet trainingSet)
{
    //Backpropagation(network, dataset, learning rate, momentum)
    //final Backpropagation train = new Backpropagation(Mynetwork(), trainingSet, 0.1, 0.9);
    final ResilientPropagation train = new ResilientPropagation(Mynetwork(), trainingSet);
    //final QuickPropagation train = new QuickPropagation(Mynetwork(), trainingSet, 0.9);

    int epoch = 1;

    do {
        train.iteration();
        System.out.println("Epoch #" + epoch + " Error:" + train.getError());
        epoch++;
    } while((train.getError() > 0.01)); 
    System.out.println("Saving network");
    System.out.println("Saving Done");
    EncogDirectoryPersistence.saveObject(new File("directory"), Mynetwork());
}
public static MLDataSet trainingData()
{
    MLDataSet trainingSet = new SQLNeuralDataSet(
            ANN.SQL,
            ANN.INPUT_SIZE,
            ANN.IDEAL_SIZE,
            ANN.SQL_DRIVER,
            ANN.SQL_URL,
            ANN.SQL_UID,
            ANN.SQL_PWD);

    return trainingSet;
}

}

Here are my results:

Predicted=0.4451817588640455, Actual=0.5260616667545941
Predicted=0.4451817588640455, Actual=0.5196499668339777
Predicted=0.4451817588640455, Actual=0.5083828048375548
Predicted=0.4451817588640455, Actual=0.49985462144799725
Predicted=0.4451817588640455, Actual=0.49085956670499675
Predicted=0.4451817588640455, Actual=0.485008112408512
Predicted=0.4451817588640455, Actual=0.47800504210686795
Predicted=0.4451817588640455, Actual=0.4693212349328293
(...and so on with the same "predicted")

Desired result (for demonstration purposes I changed the "Predicted" values to random numbers, to show what it would look like if the network were actually predicting):

Predicted=0.4451817588640455, Actual=0.5260616667545941
Predicted=0.5123312331212122, Actual=0.5196499668339777
Predicted=0.435234234234254365, Actual=0.5083828048375548
Predicted=0.673424556563455, Actual=0.49985462144799725
Predicted=0.2344673345345544235, Actual=0.49085956670499675
Predicted=0.123346457544324, Actual=0.485008112408512
Predicted=0.5673452342342342, Actual=0.47800504210686795
Predicted=0.678435234423423423, Actual=0.4693212349328293

1 Answer:

Answer 0 (score: 1):

The first cause to consider when a neural network gives strange results is normalization. Your data must be normalized; otherwise, yes, training will produce a skewed NN that always yields the same output, which is a common symptom of this problem.

Always normalize your data before feeding it into the neural network. This matters because, if you look at the sigmoid activation function, it is essentially flat for large values (both positive and negative), which makes the neural network behave as a constant. Try normalizing it like this: input = (input - median(input)) / std(input)
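
A minimal Java sketch of that suggestion, assuming the column is available as a plain double[] (the median and standard deviation are computed by hand here; this helper is illustrative and not part of the original answer):

import java.util.Arrays;

public class NormalizeColumn {
    // Normalize a column: subtract the median, divide by the standard deviation.
    public static double[] normalize(double[] column) {
        int n = column.length;
        double[] sorted = column.clone();
        Arrays.sort(sorted);
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        double mean = 0.0;
        for (double v : column) {
            mean += v;
        }
        mean /= n;

        double variance = 0.0;
        for (double v : column) {
            variance += (v - mean) * (v - mean);
        }
        double std = Math.sqrt(variance / n);

        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            result[i] = (column[i] - median) / std;
        }
        return result;
    }
}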