TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
I am unable to set the TESSDATA_PREFIX environment variable to the parent directory of the "tessdata" directory for the path below:
/Users/syzygy01/Library/Developer/CoreSimulator/Devices/3C2CC079-D784-432D-A79A-C5336017E69C/data/Containers/Bundle/Application/61ADADE0-8CFD-4815-8F33-19B0DA676619/TesstractTest.app/tessdata/eng.traineddata
Answer 0 (score: 1)
Make sure the trained data files are placed in the tessdata folder, then use the following code:
import java.io.File;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OCR_POC {
    public static void main(String[] args) throws TesseractException {
        String inputFilePath = "F:/my_documents/issues.pdf";
        Tesseract tesseract = new Tesseract();
        // Point Tess4J at the folder that contains the *.traineddata files.
        tesseract.setDatapath("F:/Tesseract/tessdata/");
        // tesseract.setLanguage("chi_sim");
        // tesseract.setLanguage("eng"); // English is the default language
        String fullText = tesseract.doOCR(new File(inputFilePath));
        System.out.println("Full text : " + fullText);
    }
}
Maven dependency for pom.xml (${tess4j} is a property that must be defined with the Tess4J version you use):
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>${tess4j}</version>
</dependency>
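The "Failed loading language 'eng'" error means Tesseract did not find eng.traineddata in the data directory it was given. As a minimal sketch, the check below uses only the Java standard library to confirm that the file is present before the directory is passed to setDatapath. The class name TessdataCheck and its methods are hypothetical helpers for illustration and are not part of Tess4J; the default path is the one used in the answer above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tess4J: checks a tessdata folder for a language file.
public class TessdataCheck {

    // Builds the expected location of <language>.traineddata inside the data directory.
    public static Path trainedDataFile(Path tessdataDir, String language) {
        return tessdataDir.resolve(language + ".traineddata");
    }

    // Returns true only if the language file exists as a regular file.
    public static boolean hasLanguage(Path tessdataDir, String language) {
        return Files.isRegularFile(trainedDataFile(tessdataDir, language));
    }

    public static void main(String[] args) {
        // Assumed defaults: the data path from the answer above and the "eng" language.
        Path tessdataDir = Paths.get(args.length > 0 ? args[0] : "F:/Tesseract/tessdata");
        String language = args.length > 1 ? args[1] : "eng";

        if (hasLanguage(tessdataDir, language)) {
            System.out.println("Found " + trainedDataFile(tessdataDir, language));
        } else {
            System.out.println("Missing " + trainedDataFile(tessdataDir, language)
                    + " (copy the traineddata file into this folder first)");
        }
    }
}
```

Running it with the folder and language as arguments, for example `java TessdataCheck F:/Tesseract/tessdata chi_sim`, reports whether that language file is where the OCR code expects it.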