How to improve accuracy with tesseract.js

Time: 2018-02-22 06:04:23

Tags: javascript tesseract

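I want to recognize the English text in the cat image. My source code is below; the output text was attached as an image. How can I improve the recognition rate so that the string Vou is recognized as You and the string advrce is recognized as advice?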

<html>
<head>
<script src="https://cdn.rawgit.com/naptha/tesseract.js/0.2.0/dist/tesseract.js">
</script>
</head>
<body>
<input type="text" id="url" placeholder="Image URL" />
<input type="button" id="go_button" value="Run" />
<div id="ocr_results"> </div>
<div id="ocr_status"> </div>
<script>
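// Run OCR on the image at `url` and show the recognized text and progress.
function runOCR(url) {
    Tesseract.recognize(url, {lang: 'eng'})
         .then(function(result) {
            // Display the recognized text.
            document.getElementById("ocr_results")
                    .innerText = result.text;
         }).progress(function(result) {
            // Display the current status and progress as a percentage.
            document.getElementById("ocr_status")
                    .innerText = result["status"] + " (" +
                        (result["progress"] * 100) + "%)";
        });
}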
document.getElementById("go_button")
        .addEventListener("click", function(e) {
            var url = document.getElementById("url").value;
            runOCR(url);
        });
</script>
</body>
</html>
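
One possible way to handle misreads like Vou and advrce is to post-process the recognized text against a word list and replace any unknown word with a known word that is a single edit away. The snippet below is only a sketch of that idea and is not part of tesseract.js: the names `WORDS`, `editDistance`, `correctWord` and `correctText` are hypothetical, and the tiny word list stands in for a real English dictionary.

```javascript
// Hypothetical word list (an assumption for illustration only);
// a real setup would load a full English dictionary instead.
const WORDS = ["you", "advice", "your", "advise", "the", "cat"];

// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value of D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of D[i-1][j]
      row[j] = Math.min(
        above + 1,                              // deletion
        row[j - 1] + 1,                         // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Replace `word` with the closest known word if it is within `maxDistance`
// edits. Known words and very short words are returned unchanged.
// On a tie, the first candidate in the list wins.
function correctWord(word, words, maxDistance = 1) {
  const lower = word.toLowerCase();
  if (lower.length < 3 || words.includes(lower)) return word;

  let best = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of words) {
    const d = editDistance(lower, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  if (best === null) return word;

  // Keep the capitalization of the first letter of the OCR output.
  const capitalized = word[0] !== word[0].toLowerCase();
  return capitalized ? best[0].toUpperCase() + best.slice(1) : best;
}

// Apply the correction to every run of ASCII letters in the text.
function correctText(text, words = WORDS) {
  return text.replace(/[A-Za-z]+/g, (w) => correctWord(w, words));
}

console.log(correctText("Vou need advrce")); // prints "You need advice"
```

In the page above, this could be applied by passing `result.text` through `correctText` before assigning it to `ocr_results`. Note that this only repairs the output after the fact; it does not make the recognition itself more accurate, and with a large dictionary it may also "correct" rare words or names that were actually read correctly.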

0 answers:

No answers yet.