Prediction is much slower than expected

Asked: 2020-05-16 13:44:36

Tags: javascript typescript tensorflow tensorflow.js

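I'm still quite new to TensorFlow and am currently working with and editing the tutorial for recognizing handwritten digits with CNNs. It works, but model.predict takes far longer than expected. There may be some basic misunderstanding on my side.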

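The relevant part is the code I added, which predicts all 65000 samples, compares the results with the labels, and outputs the images that were misclassified instead of just counting them: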

import * as tf from '@tensorflow/tfjs';
import * as tfvis from '@tensorflow/tfjs-vis';

export async function showMistakes([imageData, labelData]: [Float32Array, Uint8Array], model: tf.Sequential) {
  let start = performance.now();
  let [predictionsTensor, labelsTensor] = tf.tidy(() => {
    console.log(`(A) Time taken: ${performance.now() - start}ms`);
    // imageData = imageData.slice(0, 28 * 28);
    // labelData = labelData.slice(0, 10);
    let input = tf.tensor4d(imageData, [imageData.length / (28 * 28), 28, 28, 1]);
    console.log(`(B) Time taken: ${performance.now() - start}ms`);
    // let dummy = input.arraySync();
    // console.log(`(B2) Time taken: ${performance.now() - start}ms`);
    // let dummy2 = input.arraySync();
    // console.log(`(B3) Time taken: ${performance.now() - start}ms`);
    let predictionsFullTensor = (model.predict(input) as tf.Tensor2D);
    console.log(`(C) Time taken: ${performance.now() - start}ms`);
    let predictionsTensor = predictionsFullTensor.argMax<tf.Tensor1D>(-1);
    console.log(`(D) Time taken: ${performance.now() - start}ms`);
    let labelsTensor = tf.tensor2d(labelData, [labelData.length / 10, 10]).argMax<tf.Tensor1D>(-1);
    console.log(`(E) Time taken: ${performance.now() - start}ms`);
    console.log(tf.memory());
    return [predictionsTensor, labelsTensor];
  });

  console.log(`(F) Time taken: ${performance.now() - start}ms`);
  console.log(tf.memory());
  let [predictions, labels] = [await predictionsTensor.array(), await labelsTensor.array()];
  console.log(`(G) Time taken: ${performance.now() - start}ms`);
  predictionsTensor.dispose();
  labelsTensor.dispose();

  console.log(tf.memory());

  let tempCanvas = document.createElement("canvas");
  tempCanvas.width = 28;
  tempCanvas.height = 28;

  const MAX_FAILS = 384;
  let fails = predictions
    .map((prediction, i) => [prediction, labels[i], i] as const)
    .filter(([prediction], i) => prediction !== labels[i])
    .slice(0, MAX_FAILS)
    .map(([prediction, label, i]) => {
      let canvas = document.createElement('canvas');

      const SCALE = 2;
      const IMAGE_SIZE = 28;
      canvas.width = IMAGE_SIZE * SCALE;
      canvas.height = IMAGE_SIZE * SCALE;

      tempCanvas.getContext("2d")?.putImageData(
        new ImageData(
          Uint8ClampedArray.from(
            { length: 28 * 28 * 4 },
            (_, j) => j % 4 === 3 ? 255 : Math.round(imageData[i * 28 * 28 + Math.floor(j / 4)] * 255)
          ),
          28
        ),
        0,
        0
      );

      canvas.getContext("2d")?.drawImage(tempCanvas, 0, 0, IMAGE_SIZE * SCALE, IMAGE_SIZE * SCALE);
      return [prediction, label, i, canvas] as const;
    });

  const surface = tfvis.visor().surface({ name: 'False predictions', tab: 'Mistakes'});

  let previousContainer = surface.drawArea.querySelector("#falsePredictionsContainer");
  if (previousContainer !== null) surface.drawArea.removeChild(previousContainer);

  let container = document.createElement("div");
  container.id = "falsePredictionsContainer";
  for (let [prediction, label, i, canvas] of fails) {
    let node = document.createElement("div");
    node.className = "falsePrediction";
    node.textContent = `#${i}: predicted ${prediction}, is labeled as ${label}`;
    node.appendChild(canvas);
    container.appendChild(node);
  }
  surface.drawArea.appendChild(container);
}

It is called with showMistakes([data.datasetImages, data.datasetLabels], model);, where data is the MnistData from the tutorial.
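Stripped of tensors and canvas rendering, the comparison step in showMistakes can be sketched on plain typed arrays. This is only an illustration: argMax and findMistakes are hypothetical helpers written here, not TensorFlow.js functions, and the sketch assumes one-hot labels and a flat array of per-class scores laid out sample by sample.

```typescript
// Index of the largest value in one row (what argMax(-1) yields per sample).
function argMax(row: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < row.length; i++) {
    if (row[i] > row[best]) best = i;
  }
  return best;
}

// Indices of the samples whose predicted class differs from the one-hot label.
function findMistakes(
  scores: Float32Array,
  oneHotLabels: Uint8Array,
  numClasses = 10,
): number[] {
  const mistakes: number[] = [];
  const numSamples = oneHotLabels.length / numClasses;
  for (let s = 0; s < numSamples; s++) {
    const from = s * numClasses;
    const to = from + numClasses;
    const predicted = argMax(scores.subarray(from, to));
    const labeled = argMax(oneHotLabels.subarray(from, to));
    if (predicted !== labeled) mistakes.push(s);
  }
  return mistakes;
}
```

Running this on a handful of samples gives a CPU-only reference result to compare against the tensor version.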

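It works, but model.predict takes an unreasonable amount of time: about 16 seconds for all 65000 samples on the simplest model, which has only two 16-neuron dense layers and no convolutions. I can't really make an educated guess, but it seems to me that the main time factor is some data conversion and transfer between CPU and GPU.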

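For the very simple model mentioned above, where prediction should take almost no time, I can see that model.predict takes almost exactly as long as an .arraySync() call on the tensor. This may be just by chance, since I couldn't run many tests, but I mention it anyway. I added the otherwise useless .arraySync() calls (see the commented-out lines in the code) only for testing, because I suspected that something was wrong with my data conversion and that I was measuring the time until the input tensor is ready rather than the time the actual prediction needs.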
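One way to test that suspicion is to time each stage separately and read the clock only after the stage's promise has settled. The helper below is a minimal sketch under that assumption; timeStage is a hypothetical name, and it defaults to Date.now() only so that it runs outside a browser (performance.now() can be passed in instead).

```typescript
// Hypothetical helper, not part of TensorFlow.js: runs one stage, waits for it
// to settle, and reports how long it took according to the supplied clock.
async function timeStage(
  label: string,
  stage: () => unknown,
  now: () => number = () => Date.now(), // in a browser: () => performance.now()
): Promise<{ label: string; elapsedMs: number }> {
  const start = now();
  await stage(); // if the stage returns a promise, wait until it resolves
  return { label, elapsedMs: now() - start };
}
```

In showMistakes, one stage could be a function that returns input.data() and another a function that returns predictionsFullTensor.data(), so that the upload and the prediction are each forced to finish before the next measurement starts.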

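Note that I want to predict all the data at once, so the input tensor has the shape [65000, 28, 28, 1]. As for memory usage, tf.memory() reports numBytesInGPU: 209703140 on the GPU, which looks realistic: that is 65000 * 28 * 28 * 4 bytes plus a small overhead. Afterwards it drops back to numBytesInGPU: 114896, which also looks fine.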
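The memory figure can be checked with a few lines of arithmetic. This sketch assumes float32 storage at 4 bytes per element, as in the estimate above; tensorBytes is a hypothetical helper, not a TensorFlow.js function.

```typescript
// Bytes needed for a dense tensor of the given shape (hypothetical helper).
function tensorBytes(shape: number[], bytesPerElement = 4): number {
  return shape.reduce((count, dim) => count * dim, 1) * bytesPerElement;
}

const inputBytes = tensorBytes([65000, 28, 28, 1]); // 203840000
const reportedBytes = 209703140; // numBytesInGPU reported by tf.memory()
const overheadBytes = reportedBytes - inputBytes; // 5863140

console.log({ inputBytes, overheadBytes });
```

The input tensor alone accounts for about 204 MB, leaving roughly 5.9 MB for everything else on the GPU, which is consistent with the reported number.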

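The code is bundled with webpack, hence the import of the tfjs npm module. I'm running this on a PC that these days would rather be called a toaster, but tf.backend() still shows "webgl", so it is not limited to the CPU or anything like that.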


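PS: For a CNN with an accuracy well above 99%, the "mistakes" are usually quite interesting. To be honest, a lot of the time they are mislabeled samples or just noise in the MNIST data. The following are obviously a three and a five!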

[image: totally a three] [image: totally a five]

0 answers:

No answers