Question

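I'm currently building a web application for image classification with TensorFlow.js. I've noticed that when I fit/train the model, the SGD optimizer does a good job when predicting on the training set. The model trained with the Adam optimizer, on the other hand, defaults to a single class, even though the one-hot encoded labels contain 7 different classes.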

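I've tried lowering the learning rate several times, but it still returns the same result. Interestingly, the loss keeps decreasing even though the predictions on the training set stay the same. Why does only the SGD optimizer work in this case?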

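The code below is written as an async function that runs when the "Train Model" button is pressed. The only difference between the code that produces the two results is the optimizer used (everything else is the same).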

//COMPILATION: minimize cross entropy between labels and model predictions
  model.compile({ //learningRate is a const
    optimizer: tf.train.sgd(learningRate), //This is the source of the problem
    loss: 'categoricalCrossentropy'
  });

  //I have defined a separate oneHot fn using the built-in method in tfjs
  let oneHotLabels = await oneHot(labelsCreator()); //tensor of oneHot labels corresponding to the training set images
  let images = document.querySelectorAll(".image[style='visibility: visible;']"); //all images that the model is trained on
  //loadImages resolves to the model's input tensor (built with tf.browser.fromPixels)
  const imageTensors = await loadImages(images);

  const config = {
    epochs: numSteps, //previously defined, number of steps the model should take
    shuffle: true,
    callbacks: {
      onEpochEnd: (num, logs) => {
        console.log('Step ' + num);
        model.predict(imageTensors).print();
      }
    }
  };
  await model.fit(imageTensors, oneHotLabels, config); //wait for training to finish before the function returns
};

Results

Actual Labels:
[[1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0]]

SGD Prediction (good accuracy):
[[0.7766488, 0.0011008, 0   , 0.1785417, 0.0128197, 0.0004938, 0.0303952],
     [0.0000667, 0.0000794, 0   , 0.0006806, 0.9991421, 0.0000079, 0.0000238],
     [0.0014632, 0.5784221, 0   , 0.1196582, 0.2930434, 0.0010978, 0.0063155],
     [0.0000067, 0.000024 , 0   , 0.0015023, 0.9984038, 0.0000043, 0.0000589],
     [0.0000328, 0.0000816, 0   , 0.0027291, 0.9970328, 0.0000105, 0.0001136],
     [0.0000657, 0.0000679, 0   , 0.0176743, 0.002068 , 0.000019 , 0.9801047],
     [0.0055083, 0.0023474, 1e-7, 0.7483685, 0.215641 , 0.0009718, 0.027163 ],
     [0.0045981, 0.0015817, 0   , 0.922437 , 0.026959 , 0.0002184, 0.044206 ]]

Adam Prediction (at every step since step 1):
[[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0, 0]]
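
To put a number on the difference between the two outputs above, here is a minimal sketch in plain JavaScript that compares the arg-max of each prediction row with the arg-max of its one-hot label. The helper names `argMax` and `accuracy` are hypothetical and not part of TensorFlow.js; the arrays are copied from the printed results.

```javascript
// Hypothetical helper: index of the largest value in one row.
function argMax(row) {
  let best = 0;
  for (let i = 1; i < row.length; i++) {
    if (row[i] > row[best]) best = i;
  }
  return best;
}

// Hypothetical helper: fraction of rows whose predicted class
// matches the class encoded in the one-hot label.
function accuracy(labels, predictions) {
  let hits = 0;
  for (let i = 0; i < labels.length; i++) {
    if (argMax(labels[i]) === argMax(predictions[i])) hits++;
  }
  return hits / labels.length;
}

// Values copied from the post.
const labels = [
  [1, 0, 0, 0, 0, 0, 0],
  [0, 0, 0, 0, 1, 0, 0],
  [0, 1, 0, 0, 0, 0, 0],
  [0, 0, 0, 0, 1, 0, 0],
  [0, 0, 0, 0, 1, 0, 0],
  [0, 0, 0, 0, 0, 0, 1],
  [0, 0, 0, 1, 0, 0, 0],
  [0, 0, 0, 1, 0, 0, 0],
];

const sgd = [
  [0.7766488, 0.0011008, 0, 0.1785417, 0.0128197, 0.0004938, 0.0303952],
  [0.0000667, 0.0000794, 0, 0.0006806, 0.9991421, 0.0000079, 0.0000238],
  [0.0014632, 0.5784221, 0, 0.1196582, 0.2930434, 0.0010978, 0.0063155],
  [0.0000067, 0.000024, 0, 0.0015023, 0.9984038, 0.0000043, 0.0000589],
  [0.0000328, 0.0000816, 0, 0.0027291, 0.9970328, 0.0000105, 0.0001136],
  [0.0000657, 0.0000679, 0, 0.0176743, 0.002068, 0.000019, 0.9801047],
  [0.0055083, 0.0023474, 1e-7, 0.7483685, 0.215641, 0.0009718, 0.027163],
  [0.0045981, 0.0015817, 0, 0.922437, 0.026959, 0.0002184, 0.044206],
];

// Adam printed the same row for every image.
const adam = labels.map(() => [0, 0, 0, 0, 1, 0, 0]);

console.log('SGD accuracy:', accuracy(labels, sgd));   // 1
console.log('Adam accuracy:', accuracy(labels, adam)); // 0.375
```

The Adam model is only right on the three images whose true label happens to be the class it always predicts.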

Answer 1

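Please describe your problem and your network in more detail. It may be that SGD is simply a better way to train it, or it may be that you are not using appropriate parameter values for training the network with Adam. Try lowering the learning rate first.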
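
As a rough sketch of that suggestion, the snippet below compiles a model with Adam at a reduced learning rate. It assumes `tf` is the loaded TensorFlow.js namespace and `model` is the model from the question; the helper name `compileWithAdam` and the default of `1e-4` are illustrative assumptions, not values taken from the post, and there is no guarantee this alone fixes the behaviour.

```javascript
// Hypothetical helper: compile `model` with Adam at a small learning rate.
// `tf` is passed in so the same code works whether TensorFlow.js was
// loaded through a <script> tag or through an import.
function compileWithAdam(tf, model, learningRate = 1e-4) {
  model.compile({
    optimizer: tf.train.adam(learningRate), // 1e-4 is an assumed starting point
    loss: 'categoricalCrossentropy',        // same loss as in the question
    metrics: ['accuracy'],                  // adds accuracy to the epoch logs
  });
  return model;
}

// Usage inside the training function from the question (not run here):
//   compileWithAdam(tf, model, 1e-4);
//   await model.fit(imageTensors, oneHotLabels, config);
```

With `metrics: ['accuracy']` set, the `logs` object passed to `onEpochEnd` carries an accuracy value next to the loss, which makes it easier to see whether a given learning rate changes anything.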

Good luck!
