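I am working on a program (from Hyperskills) that is supposed to simulate the most basic form of machine learning: simple digit recognition.
I have a problem with the learning part of the program: I run into an infinite loop, and I really don't know what my stopping condition should be.
The problem:
We have 15 input neurons (a0,...,a14) feeding 10 output neurons (o0,...,o9). Each output neuron has 15 weights, for a total of 15x10 weights. The 10 output neurons correspond to the digits I am trying to recognize.
Each output is computed as: o0 = S(a0 * w(a0,o0) + ... + a14 * w(a14,o0) + b * w(b,o0)), where S() is the sigmoid function, w(a,o) is a weight, and b is the bias. We can ignore the bias in the program.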
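A minimal sketch of the output formula above, for illustration only. The class and method names (NeuronOutputSketch, sigmoid, output) are hypothetical and are not part of the exercise; only the formula comes from the description.

```java
// Hypothetical helper names; only the formula itself comes from the post.
final class NeuronOutputSketch {

    // Logistic sigmoid S(x) = 1 / (1 + e^(-x)).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Weighted sum of the inputs plus the bias term b * w(b, o), passed through S().
    static double output(double[] inputs, double[] weights, double bias, double biasWeight) {
        double sum = bias * biasWeight;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sigmoid(sum);
    }

    public static void main(String[] args) {
        // The 15 pixels of the ideal digit 0 on a 3x5 grid.
        double[] zero = {1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1};
        double[] weights = new double[15]; // all weights start at 0
        // With all-zero weights the weighted sum is 0, so the output is S(0).
        System.out.println(output(zero, weights, 1.0, 0.0));
    }
}
```

With every weight at 0 the weighted sum is 0 for any input, so every output neuron starts at S(0) = 0.5 regardless of the digit shown.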
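The ideal digit 0, for example, is {1,1,1,1,0,1,1,0,1,1,0,1,1,1,1,1} (15 pixel values followed by a trailing 1 for the bias).
All weights are initialized either to 0 or to random Gaussian numbers.
The network should find the ideal weights by itself...
To learn, the network should:
First, for each output neuron 'o', compute its output, e.g. S(o0). In our example, suppose we are only trying to find the ideal weights for o0.
Use the delta rule. For each iteration, find the change in each weight between the current iteration and the next one: Δw(ai,oj) = η * ai * (oj_ideal - oj). Here η is the learning rate coefficient, ai is the value at position i of the input digit being tested, oj_ideal is the ideal value of output j for that input, and oj is the current output S(...) of neuron j.
So for o0 we get 10x15 Δw values: compute all 10 outputs S(o0) (one for each ideal input from 0 to 9) and the corresponding Δw values for each of them.
At this point we adjust the weights of the current output neuron by adding the mean of these Δw values: mean Δw(ai) = (sum of all Δw(ai) over the 10 inputs) / 10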
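The steps above can be sketched as one "generation" for a single output neuron. This is only an illustration under my reading of the exercise text: the names (DeltaRuleSketch, trainOneGeneration) are hypothetical, the bias is left out, and the tiny two-sample data set in main is made up to keep the arithmetic easy to follow.

```java
// Hypothetical sketch of one generation of the delta rule for one output neuron.
final class DeltaRuleSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    static double output(double[] input, double[] weights) {
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += input[i] * weights[i];
        }
        return sigmoid(sum);
    }

    // samples[s] is one training input, ideal[s] the ideal output of THIS neuron for it.
    // Accumulates eta * a_i * (ideal - output) over all samples, averages it,
    // adds the mean to the weights, and returns the mean delta per weight.
    static double[] trainOneGeneration(double[][] samples, double[] ideal,
                                       double[] weights, double eta) {
        double[] meanDelta = new double[weights.length];
        for (int s = 0; s < samples.length; s++) {
            // All outputs use the weights from before this generation's update.
            double error = ideal[s] - output(samples[s], weights);
            for (int i = 0; i < weights.length; i++) {
                meanDelta[i] += eta * samples[s][i] * error;
            }
        }
        for (int i = 0; i < weights.length; i++) {
            meanDelta[i] /= samples.length;
            weights[i] += meanDelta[i];
        }
        return meanDelta;
    }

    public static void main(String[] args) {
        double[][] samples = {{1, 0}, {0, 1}}; // made-up two-pixel inputs
        double[] ideal = {1, 0};               // this neuron should fire only for sample 0
        double[] weights = {0, 0};
        double[] mean = trainOneGeneration(samples, ideal, weights, 0.5);
        System.out.println(java.util.Arrays.toString(mean));
        System.out.println(java.util.Arrays.toString(weights));
    }
}
```

Note that the error term uses the ideal output of the neuron for each tested input (a single number per input), while ai comes from the pixels of that input.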
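The exercise description gives two hints:
You should repeat this process for some time. Each update of the weights is called a new generation of the network. The best approach is to repeat until the mean Δw becomes very small.
How long does the program need to get good results? You can try 10, 100, and 1000 generations. In general, you can stop learning when your weights no longer change from one generation to the next, or change only slightly. This means you have reached a local minimum.
==> Hint from someone who completed the exercise: the ideal output is a set of numbers with a single 1 and zeros everywhere else. The 1 sits at the position where the digit being tested equals the index of the output neuron.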
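One way to read the hints above is sketched below: keep generating until the largest mean Δw falls under a small threshold, with a cap on the number of generations as a safety net. Mathematically, S() only approaches 0 and 1 for finite sums, so this sketch compares against a threshold rather than testing the outputs for exact equality with the ideal values. The names (StoppingCriterionSketch, oneHot, train) and the parameters epsilon and maxGenerations are my own assumptions, not part of the exercise.

```java
// Hypothetical sketch: train one output neuron until the mean delta is small
// or a maximum number of generations has been reached.
final class StoppingCriterionSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    static double output(double[] input, double[] weights) {
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += input[i] * weights[i];
        }
        return sigmoid(sum);
    }

    // Ideal outputs as in the last hint: row d has a single 1 at position d.
    static double[][] oneHot(int n) {
        double[][] ideal = new double[n][n];
        for (int d = 0; d < n; d++) {
            ideal[d][d] = 1.0;
        }
        return ideal;
    }

    // Returns the number of generations that were run.
    static int train(double[][] samples, double[] ideal, double[] weights,
                     double eta, double epsilon, int maxGenerations) {
        for (int gen = 1; gen <= maxGenerations; gen++) {
            double[] mean = new double[weights.length];
            for (int s = 0; s < samples.length; s++) {
                double error = ideal[s] - output(samples[s], weights);
                for (int i = 0; i < weights.length; i++) {
                    mean[i] += eta * samples[s][i] * error;
                }
            }
            double largest = 0;
            for (int i = 0; i < weights.length; i++) {
                mean[i] /= samples.length;
                weights[i] += mean[i];
                largest = Math.max(largest, Math.abs(mean[i]));
            }
            if (largest < epsilon) {
                return gen; // weights barely changed: stop learning
            }
        }
        return maxGenerations; // safety net: never loop forever
    }

    public static void main(String[] args) {
        double[][] samples = {{1, 0}, {0, 1}}; // made-up two-pixel inputs
        double[] idealForNeuron0 = {1, 0};     // column 0 of a 2x2 one-hot matrix
        double[] weights = {0, 0};
        int generations = train(samples, idealForNeuron0, weights, 0.5, 1e-3, 100_000);
        System.out.println("stopped after " + generations + " generations");
        System.out.println("output for sample 0: " + output(samples[0], weights));
        System.out.println("output for sample 1: " + output(samples[1], weights));
    }
}
```

The values 1e-3 and 100_000 are arbitrary choices for the illustration; the exercise itself only suggests trying 10, 100, and 1000 generations.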
Here is the code. See class Network > learning().
(Just press 1 to start the learning process.)
import java.util.Arrays;
import java.util.Random;
import java.util.Scanner;
class Network {

    private double[][] idealNeuron; //ideal neurons : o0 to o9
    private double[][] weights = new double[10][16]; //network weights, to be initialized
    private double learningRateCoefficient = 0.5;
    private double[][] idealOutputsResult;

    Network(){
        initialize();
        learning();
    }

    public void initialize(){
        idealNeuron = new double[][]{ //[10][16]
            {1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1}, //0
            {0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1}, //1
            {1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1}, //2
            {1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1}, //3
            {1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1}, //4
            {1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1}, //5
            {1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1}, //6
            {1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1}, //7
            {1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1}, //8
            {1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1}, //9
        };
        /*Random r = new Random();
        double value;
        for(int i = 0; i < 10; i++){
            for(int j = 0; j < 15; j++){
                value = r.nextGaussian();
                weights[i][j] = value;
            }
        }*/
        weights = new double[][]{ //[10][16]
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
            {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
        };
        idealOutputsResult = new double[][]{ //[10][10]
            {1, 0, 0, 0, 0, 0, 0, 0, 0, 0},
            {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
            {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
            {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
            {0, 0, 0, 0, 0, 1, 0, 0, 0, 0},
            {0, 0, 0, 0, 0, 0, 1, 0, 0, 0},
            {0, 0, 0, 0, 0, 0, 0, 1, 0, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
        };
    }

    public void learning(){
        int fh =0;
        double[] outputs = new double[10];
        boolean isIdeal;
        double[][] delta = new double[10][15]; //all deltaRules value for one neuron o
        double[] dws= new double[15]; //deltaRules for each neuron from the input test
        for(int o = 0; o < 10; o++){ // for each neuron o
            for(int i = 0; i < 10; i++){ //calculate output o for each test from number 0 to 9
                outputs[i]= getMyOutput(idealNeuron[i],weights[o]);
            }
            isIdeal = validate(outputs,o); //check if we find the rights weights, since with the ideal weights we get the idealOutputsResult
            while(isIdeal == false){ //since it's not ideal, we need to calculate 10x16 deltaRules then make 10x16 weights adjustments
                for(int j = 0; j < delta.length; j++){ //position input test by calculated outputs
                    for(int k = 0; k < 15; k++){ //calculate and store each 15(+1) deltaRules/test in delta[][]//15 BECAUSE THE BIAS ARE NOT TAKEN INTO ACCOUNT AT THIS POINT
                        dws[k] = deltaRule(idealNeuron[j][k],idealNeuron[o][k],outputs[j]);
                    }
                    delta[j] = dws;
                }
                //Adjust Weights
                adjustWeights(delta,o);
                //re-calculate output o for each test from number 0 to 9
                for(int i = 0; i < 10; i++){
                    outputs[i]= getMyOutput(idealNeuron[i],weights[o]);
                }
                //Check if we need to stop for that particular output o
                isIdeal = validate(outputs,o);
                //System.out.println("MY WEIGHTS--> "+ Arrays.toString(outputs));
            }
            //now next neuron
            System.out.println("NEXT NEURON");
        }
        System.out.println("10X15 WEIGHTS FOUNDED !! AYY!");
    }

    public boolean validate(double[] output, int positionOutputBeingTested){
        boolean isIdeal = true;
        for(int j = 0; j < 10; j++){
            if(output[j] != idealOutputsResult[positionOutputBeingTested][j]){
                isIdeal = false;
                break;
            }
        }
        return isIdeal;
    }

    public void adjustWeights(double[][] delta, int positionOutputBeingTested){
        double mean = 0;
        for(int j = 0; j < 15; j++){
            for(int i = 0; i < 10; i++){
                mean += delta[i][j];
            }
            mean = mean / 10;
            weights[positionOutputBeingTested][j] += mean;
            mean = 0;
        }
    }

    public double deltaRule(double ai, double ojIdeal, double oj){
        return learningRateCoefficient * ai * (ojIdeal - oj);
    }

    public double getMyOutput(double[] input, double[] weights){
        double output = 0;
        for(int i = 0; i < input.length; i++){
            output += input[i] * weights[i];
        }
        output = 1 / (1 + Math.exp(-output));
        return output;
    }

    public double[][] getWeights() {
        return weights;
    }

    public void printweights(){
        for(int i = 0 ; i < weights.length; i++){
            System.out.println(Arrays.toString(weights[i]));
        }
    }
}
class Machine {

    private double[][] weights;
    private double[] input;

    Machine(double[] input, double[][] weights){
        this.input = input;
        this.weights = weights;
    }

    public int interpret(){
        double[] outputs = new double[10];
        for(int i = 0; i < 10; i++){ //calculate all outputs o
            outputs[i] = getMyOutput(input,weights[i]);
        }
        int result = 0;
        double resultOutput = outputs[0];
        for(int i = 0; i < 10; i++){
            if(outputs[i] > resultOutput){
                resultOutput = outputs[i];
                result = i;
            }
        }
        return result;
    }

    public double getMyOutput(double[] input, double[] weights){
        double output = 0;
        for(int i = 0; i < input.length; i++){
            output += input[i] * weights[i];
        }
        output = 1 / (1 + Math.exp(-output));
        return output;
    }
}
class myInterface {

    private Scanner scanner = new Scanner(System.in);
    private boolean systemInitialized = false;
    private Network network;
    private Machine machine;

    public void start(){
        System.out.println("1. Learn the network");
        System.out.println("2. Guess a number");
        int choice = scanner.nextInt();
        System.out.println("Your choice: "+ choice);
        if(choice == 1){
            if(systemInitialized){
                System.out.println("System already initialized.");
                start();
            } else {
                learn();
            }
        } else if (choice == 2){
            inputs();
        }
    }

    public void learn(){
        systemInitialized = true;
        //... method to call for network learning
        System.out.println("Learning...");
        network = new Network();
        // ... method to call to save file - Serialization
        System.out.println("Done! Saved to the file.");
        start();
    }

    public void inputs(){
        Scanner scanner = new Scanner(System.in);
        String[][] inputS = new String[5][3];
        String[] out = new String[5];
        double[] input = new double[15];
        String[] row;
        for (int i = 0; i < 5; i++) {
            String in = scanner.nextLine();
            out[i] = in;
            row = in.split("");
            inputS[i] = row;
        }
        int count = 0;
        for (int i = 0; i < 5; i++){
            for (int j = 0; j < 3; j++){
                if (inputS[i][j].equals("_")){
                    input[count] = 0;
                } else if (inputS[i][j].toUpperCase().equals("X")){
                    input[count] = 1;
                }
                count++;
            }
        }
        System.out.println("Input grid:");
        for(int i = 0; i < 5; i++){
            System.out.println(out[i]);
        }
        System.out.println("This number is "+new Machine(input, network.getWeights()).interpret());
    }
}
public class Main {
    public static void main(String[] args){
        new myInterface().start();
    }
}