I am trying to recognize random characters on Android using the tess-two API.
I have a printed sheet of paper with the string "5XqaLB" on it.
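When I point the camera at the part of the page that contains the string, the recognized text comes back with some of the characters substituted.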
baseApi.setVariable("load_system_dawg", "0");
baseApi.setVariable("load_freq_dawg", "0");
baseApi.setVariable("load_punc_dawg", "0");
baseApi.setVariable("load_number_dawg", "0");
baseApi.setVariable("load_unambig_dawg", "0");
baseApi.setVariable("load_bigram_dawg", "0");
baseApi.setVariable("load_fixed_length_dawgs", "0");
baseApi.setVariable("segment_penalty_garbage", "0");
baseApi.setVariable("segment_penalty_dict_nonword", "0");
baseApi.setVariable("segment_penalty_dict_frequent_word", "0");
baseApi.setVariable("segment_penalty_dict_case_ok", "0");
baseApi.setVariable("segment_penalty_dict_case_bad", "0");
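I think this happens because tesseract tries to guess a word from the recognized characters. I have searched a lot but could not find a solution. Does anyone have an idea how to avoid these substitutions?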
I have already tried a whitelist, a blacklist, and the configuration variables listed above.
build.gradle
Can anyone suggest how to make tesseract recognize only plain characters, without matching them against words?
Answer 0 (score: -1)
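I managed to solve a similar problem. In my case I was recognizing license plate characters. Instead of running tesseract on the whole plate image, I added a preprocessing step that separates the characters, so that I can run tesseract on each character individually. My configuration variables: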
final TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(TESSBASE_PATH, DEFAULT_DIC, TessBaseAPI.OEM_DEFAULT);
baseApi.setDebug(true);
baseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "ABCDEFGHIJKLMNOPQRSTUVXWYZ1234567890");
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_CHAR);
baseApi.setVariable("load_system_dawg", TessBaseAPI.VAR_FALSE);
baseApi.setVariable("load_freq_dawg", TessBaseAPI.VAR_FALSE);
baseApi.setVariable("load_punc_dawg", TessBaseAPI.VAR_FALSE);
baseApi.setVariable("load_number_dawg", TessBaseAPI.VAR_TRUE);
baseApi.setVariable("load_unambig_dawg", TessBaseAPI.VAR_FALSE);
baseApi.setVariable("load_bigram_dawg", TessBaseAPI.VAR_FALSE);
baseApi.setVariable("load_fixed_length_dawgs", TessBaseAPI.VAR_FALSE);
// The segment_penalty_* variables are numeric, so they take "0" (as in the question) rather than a boolean flag.
baseApi.setVariable("segment_penalty_garbage", "0");
baseApi.setVariable("segment_penalty_dict_nonword", "0");
baseApi.setVariable("segment_penalty_dict_frequent_word", "0");
baseApi.setVariable("segment_penalty_dict_case_ok", "0");
baseApi.setVariable("segment_penalty_dict_case_bad", "0");
return baseApi;
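The answer does not show how the characters were separated before being passed to tesseract. One simple way to do it, given here only as a sketch and not as the answerer's actual method, is a vertical projection: on a binarized image, scan the columns and treat every run of adjacent columns that contain ink as one character. The class name `CharSegmenter`, the method `findCharacterSpans`, and the `boolean[][]` input format are all hypothetical names chosen for this illustration, and the approach assumes the characters do not touch or overlap horizontally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of tess-two.
final class CharSegmenter {

    /**
     * Finds runs of adjacent columns that contain at least one ink pixel.
     * ink[y][x] is true where the binarized image has a dark pixel; the
     * array is assumed to be rectangular.
     * Each result is {startColumn, endColumnExclusive}.
     */
    static List<int[]> findCharacterSpans(boolean[][] ink) {
        List<int[]> spans = new ArrayList<>();
        if (ink.length == 0) {
            return spans;
        }
        int width = ink[0].length;
        int start = -1; // -1 means we are currently inside a gap
        for (int x = 0; x < width; x++) {
            boolean columnHasInk = false;
            for (boolean[] row : ink) {
                if (row[x]) {
                    columnHasInk = true;
                    break;
                }
            }
            if (columnHasInk && start < 0) {
                start = x;                        // a character begins here
            } else if (!columnHasInk && start >= 0) {
                spans.add(new int[] {start, x});  // the character ended at x - 1
                start = -1;
            }
        }
        if (start >= 0) {
            spans.add(new int[] {start, width});  // character touches the right edge
        }
        return spans;
    }
}
```

On Android, each span could then be cropped out of the source bitmap, for example with `Bitmap.createBitmap(source, x, y, width, height)`, and handed to the `baseApi` configured above via `setImage` followed by `getUTF8Text()`, so that `PSM_SINGLE_CHAR` sees exactly one character per call. Real camera images usually need thresholding and some noise removal first, otherwise stray pixels will produce spurious spans.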