I'm still fairly new to TensorFlow and am currently working through and extending the tutorial for recognizing handwritten digits with CNNs. It works, but model.predict takes much longer than expected, and there may be a basic misconception on my part.
The relevant part is code I added that runs predictions on all 65000 samples, compares the results against the labels, and displays the images where mistakes happen instead of merely counting them:
import * as tf from '@tensorflow/tfjs';
import * as tfvis from '@tensorflow/tfjs-vis';

export async function showMistakes([imageData, labelData]: [Float32Array, Uint8Array], model: tf.Sequential) {
  let start = performance.now();
  let [predictionsTensor, labelsTensor] = tf.tidy(() => {
    console.log(`(A) Time taken: ${performance.now() - start}ms`);
    // imageData = imageData.slice(0, 28 * 28);
    // labelData = labelData.slice(0, 10);
    // one big batch of shape [65000, 28, 28, 1]
    let input = tf.tensor4d(imageData, [imageData.length / (28 * 28), 28, 28, 1]);
    console.log(`(B) Time taken: ${performance.now() - start}ms`);
    // let dummy = input.arraySync();
    // console.log(`(B2) Time taken: ${performance.now() - start}ms`);
    // let dummy2 = input.arraySync();
    // console.log(`(B3) Time taken: ${performance.now() - start}ms`);
    // this is the call that takes ~16 seconds
    let predictionsFullTensor = (model.predict(input) as tf.Tensor2D);
    console.log(`(C) Time taken: ${performance.now() - start}ms`);
    // collapse the 10 class scores per sample to the predicted digit
    let predictionsTensor = predictionsFullTensor.argMax<tf.Tensor1D>(-1);
    console.log(`(D) Time taken: ${performance.now() - start}ms`);
    let labelsTensor = tf.tensor2d(labelData, [labelData.length / 10, 10]).argMax<tf.Tensor1D>(-1);
    console.log(`(E) Time taken: ${performance.now() - start}ms`);
    console.log(tf.memory());
    return [predictionsTensor, labelsTensor];
  });
  console.log(`(F) Time taken: ${performance.now() - start}ms`);
  console.log(tf.memory());
  let [predictions, labels] = [await predictionsTensor.array(), await labelsTensor.array()];
  console.log(`(G) Time taken: ${performance.now() - start}ms`);
  predictionsTensor.dispose();
  labelsTensor.dispose();
  console.log(tf.memory());

  let tempCanvas = document.createElement("canvas");
  tempCanvas.width = 28;
  tempCanvas.height = 28;
  const MAX_FAILS = 384;
  let fails = predictions
    .map((prediction, i) => [prediction, labels[i], i] as const)
    .filter(([prediction, label]) => prediction !== label)
    .slice(0, MAX_FAILS)
    .map(([prediction, label, i]) => {
      let canvas = document.createElement('canvas');
      const SCALE = 2;
      const IMAGE_SIZE = 28;
      canvas.width = IMAGE_SIZE * SCALE;
      canvas.height = IMAGE_SIZE * SCALE;
      // expand the normalized grayscale floats into opaque RGBA pixels
      tempCanvas.getContext("2d")?.putImageData(
        new ImageData(
          Uint8ClampedArray.from(
            { length: 28 * 28 * 4 },
            (_, j) => j % 4 === 3 ? 255 : Math.round(imageData[i * 28 * 28 + Math.floor(j / 4)] * 255)
          ),
          28
        ),
        0,
        0
      );
      // scale the 28x28 image up onto the display canvas
      canvas.getContext("2d")?.drawImage(tempCanvas, 0, 0, IMAGE_SIZE * SCALE, IMAGE_SIZE * SCALE);
      return [prediction, label, i, canvas] as const;
    });

  const surface = tfvis.visor().surface({ name: 'False predictions', tab: 'Mistakes' });
  let previousContainer = surface.drawArea.querySelector("#falsePredictionsContainer");
  if (previousContainer !== null) surface.drawArea.removeChild(previousContainer);
  let container = document.createElement("div");
  container.id = "falsePredictionsContainer";
  for (let [prediction, label, i, canvas] of fails) {
    let node = document.createElement("div");
    node.className = "falsePrediction";
    node.textContent = `#${i}: predicted ${prediction}, is labeled as ${label}`;
    node.appendChild(canvas);
    container.appendChild(node);
  }
  surface.drawArea.appendChild(container);
}
It is called as showMistakes([data.datasetImages, data.datasetLabels], model); where data is the MnistData instance from the tutorial.
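(If I read the tutorial's data.js correctly, data.datasetImages is a Float32Array of 65000 * 784 pixel values normalized to [0, 1] and data.datasetLabels is a Uint8Array of 65000 * 10 one-hot entries, which is what the [imageData, labelData] parameter above assumes.)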
It works, but model.predict takes an unreasonably long time: about 16 seconds for all 65000 samples even on the simplest model (just two 16-neuron dense layers, no convolutions). I may not be in a position to make an educated guess, but it looks to me as if the main time factor is some data conversion and transfer between CPU and GPU. For the very simple model just mentioned, the prediction itself should take essentially no time, and I can see that model.predict takes almost exactly as long as an .arraySync() call on a tensor (perhaps just by chance, since I couldn't run many tests, but I mention it anyway). I added the useless .arraySync() calls (see the comments in the code) purely for testing, because I suspected a problem with my data conversion and that I was really observing the time until the input tensor is ready rather than the time the actual prediction takes.
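To illustrate what I mean by possibly measuring the wrong thing: as far as I understand the webgl backend, ops are queued asynchronously, so the time measured right after model.predict returns may not include the actual GPU execution; only downloading the result forces synchronization. A minimal timing sketch of what I believe the honest measurement looks like (assuming the same model and input tensor as above):

let t0 = performance.now();
let output = model.predict(input) as tf.Tensor2D;
// if the backend merely enqueues the GPU work, this is mostly enqueue/compile time
console.log(`predict returned after ${performance.now() - t0}ms`);
// downloading the data forces any queued GPU work to complete first
await output.data();
console.log(`results actually available after ${performance.now() - t0}ms`);
output.dispose();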
Note that I am predicting all the data at once, so the input tensor has shape [65000, 28, 28, 1]. Regarding memory usage, I can observe the following: on the GPU, tf.memory() shows numBytesInGPU: 209703140, which looks entirely realistic for 65000 * 28 * 28 * 4 bytes plus a small overhead (it drops back to numBytesInGPU: 114896 afterwards, which also looks fine).
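For reference, the input alone accounts for 65000 * 28 * 28 * 4 = 203,840,000 bytes, so the observed 209,703,140 bytes leave only about 5.9 MB for everything else. In case the one huge batch is itself part of the problem, one diagnostic I considered is letting the layers API chunk the prediction via the batchSize option of model.predict (a sketch with an arbitrarily picked chunk size; the documented default is 32):

// same prediction as above, but processed in chunks of 1024 samples
let predictionsFullTensor = model.predict(input, { batchSize: 1024 }) as tf.Tensor2D;
let predictionsTensor = predictionsFullTensor.argMax<tf.Tensor1D>(-1);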
The code is bundled with webpack, so the imports come from the tfjs npm module. I'm running this on a PC that these days would rather be called a toaster, but tf.backend() still shows "webgl", so it's not restricted to the CPU or anything like that.
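As far as I know, the webgl backend also compiles its shaders on first use, so the very first predict call is expected to be slow regardless; here is the minimal warm-up sketch I would run before taking any timings (using only tf.ready(), tf.getBackend() and a throwaway predict):

await tf.ready();
console.log(tf.getBackend()); // "webgl" on my machine
// the first predict triggers shader compilation, so keep it out of the timings
let warmup = model.predict(tf.zeros([1, 28, 28, 1])) as tf.Tensor;
await warmup.data();
warmup.dispose();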
PS: With a CNN whose accuracy is well above 99%, the remaining "mistakes" are often quite interesting - to be honest, many of them are mislabeled or are simply noise in the MNIST data. The following are obviously a three and a five!