Question

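I followed the TensorFlow.js guide on recognizing handwritten digits in Node, and I ended up with a folder containing two files: model.json and weights.bin. Now I want to use this model to recognize a digit in an image.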

const tf = require('@tensorflow/tfjs-node');
const path = require('path');
const jimp = require('jimp');

async function loadModel() {
  const model = await tf.loadLayersModel(`file://${path.resolve('./model/model.json')}`); // load model
  jimp.read('img0.png').then(img => { // load image with white background and black handwritten number
    img.resize(28, 28).greyscale().invert(); // resize the image and make background black and the number itself white
    console.log(img.bitmap.data.length);
    const buffer = img.bitmap.data.reduce((acc, curr, idx) => { // removing gba bytes so we have only value of r, which is a number in range 0-255
      if (idx % 4 === 0) {
        acc.push(curr);
      }
      return acc;
    }, []);
    console.log(buffer.length); // now we have 28x28 bytes
    const imageShape = [buffer.length, 28, 28, 1]; // I have no idea
    const image = new Float32Array(tf.util.sizeFromShape(imageShape)); // what
    image.set(buffer); // I'm 
    const prediction = model.execute(tf.tensor4d(image, imageShape)); // doing 
    console.log(prediction); // here
  });
}
loadModel();

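So I have a buffer of 784 bytes, corresponding to the 784 pixel values of one image, and I want to get the prediction as a single digit, but I don't know how to do that.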

Update: I used predict instead of execute, then called print(), and it gave me the result!

Answer 1

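To use the model for inference, the tensor for a single image must have the shape [28, 28, 1]. However, model prediction expects a batch of images, so the tensor passed to the predict function should have the shape [b, 28, 28, 1], where b is the number of images to predict on. Also consider using predict instead of execute.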

Here is the change:

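const image = new Float32Array(28 * 28 * 1); // one 28x28 single-channel image
image.set(buffer);
// Batch of one image: shape [batch, height, width, channels]
const prediction = model.predict(tf.tensor4d(image, [1, 28, 28, 1]));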
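
The question also asks for the prediction as a single digit. A minimal sketch of that step follows, under two assumptions that are not confirmed by the post: that the model outputs 10 scores per image (one per digit 0-9), and that it was trained on pixel values scaled to the range 0-1. The helper names rgbaToNormalizedGray and argMax are hypothetical and are not part of TensorFlow.js or Jimp.

```javascript
// Hypothetical helpers for illustration; not part of TensorFlow.js or Jimp.

// Keep only the red channel of an RGBA byte buffer and scale 0-255 to 0-1.
// Assumption: the model was trained on inputs normalized to [0, 1].
function rgbaToNormalizedGray(rgba) {
  const pixels = new Float32Array(rgba.length / 4);
  for (let i = 0; i < pixels.length; i++) {
    pixels[i] = rgba[i * 4] / 255;
  }
  return pixels;
}

// Return the index of the largest score. With 10 output scores,
// that index is the predicted digit.
function argMax(scores) {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}

// Possible use with the model and image from the question
// (requires @tensorflow/tfjs-node and jimp, so it is left commented out):
//   const pixels = rgbaToNormalizedGray(img.bitmap.data);
//   const input = tf.tensor4d(pixels, [1, 28, 28, 1]);
//   const scores = model.predict(input).dataSync(); // assumed: 10 values
//   console.log(argMax(scores));
```

If the model turns out to expect raw 0-255 values, drop the division by 255. The same index can also be computed on the tensor itself with `model.predict(input).argMax(-1)`.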

How to actually recognize a digit using an already trained model?
