Question

有没有一种方法可以按块对Google文档文本检测API的文本响应进行分组？如果有提供的解决方案，我可能在文档中忽略了它。我目前正在使用node.js从用户提供的图像中获取文本。这是我的代码：

const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient({
  keyFilename: 'APIKey.json'
});
client
  .documentTextDetection('image.jpg')
  .then(results => {
    res.send(results);
  })
  .catch(err => {
    res.send(err);
  });

谢谢。

Answer 1

我不确定是否有标准化的方法来执行此操作，但是Vision API确实为我们提供了编写块文本所需的一切，包括相关的中断（请参见Vision API break Types）。因此，我们可以枚举每个块并从中创建文本。

我不考虑其他几种中断类型（HYPHEN，SURE_SPACE），但是我认为添加这些中断类型应该很容易。

例如：

const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient({
    keyFilename: 'APIKey.json'
});

client
.documentTextDetection('image.jpg')
.then(results => {
    console.log("Text blocks: ", getTextBlocks(results));
})
.catch(err => {
    console.error("An error occurred: ", err);
});

function getTextBlocks(visionResults) {
    let textBlocks = [];
    let blockIndex = 0;;
    visionResults.forEach(result => {
        result.fullTextAnnotation.pages.forEach(page => {
            textBlocks = textBlocks.concat(page.blocks.map(block => { return { blockIndex: blockIndex++, text: getBlockText(block) }}));
        });
    });
    return textBlocks;
}

function getBlockText(block) {
    let result = '';
    block.paragraphs.forEach(paragraph => {
        paragraph.words.forEach(word => {
            word.symbols.forEach(symbol => {
                result += symbol.text;
                if (symbol.property && symbol.property.detectedBreak) {
                    const breakType = symbol.property.detectedBreak.type;
                    if (['EOL_SURE_SPACE' ,'SPACE'].includes(breakType)) {
                        result += " ";
                    }
                    if (['EOL_SURE_SPACE' ,'LINE_BREAK'].includes(breakType)) {
                        result += "\n"; // Perhaps use os.EOL for correctness.
                    }
                }
            })
        })
    })

    return result;
}

Google Vision API文本检测按块显示单词

1 个答案: