How do I stop tesseract while it is OCRing?

Asked: 2012-08-27 08:22:17

Tags: iphone block ocr tesseract

I am using tesseract to OCR images in my iPhone app. I want to be able to stop all OCR processing while it is running.

Here is my code:

In the .h file

dispatch_queue_t main;
tesseract::TessBaseAPI *tesseract;
uint32_t *pixels;

In the .m file

- (void)processOcrAt:(UIImage *)image
{
    [self setTesseractImage:image];

    //char* utf8Text = tesseract->GetUTF8Text();
    //[self performSelector:@selector(ocrProcessingFinished:) withObject:[NSString stringWithUTF8String:utf8Text]];
    //dispatch_queue_t queue = dispatch_queue_create("com.awesome", 0);

    main = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_async(main, ^{
        tesseract->Recognize(NULL);
        char* utf8Text = tesseract->GetUTF8Text();
        [self performSelectorOnMainThread:@selector(ocrProcessingFinished:)
                               withObject:[NSString stringWithUTF8String:utf8Text]
                               waitUntilDone:NO];
        delete [] utf8Text;
    });


}

-(IBAction)backPressed:(id)sender{
    dispatch_release(main);
    tesseract->Clear();
    //tesseract->End();

    delete tesseract;
    tesseract = nil;
    delete pixels;
    [self.navigationController popViewControllerAnimated:YES];
}

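When I tap the back button while OCR is running, the app crashes, because the OCR is still running. How can I stop it? I could not find any method for this in tesseract.

3 Answers:

Answer 0 (score: 1)

What about the ETEXT_DESC parameter of the Recognize() function? (Not sure whether it existed when you wrote your answer, fulberto100.) It is a monitor that can be used to track the progress of recognition and also to cancel it. It is used in TessBaseAPI::ProcessPage. I haven't tried it myself.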

ETEXT_DESC monitor;
monitor.cancel = NULL;
monitor.cancel_this = NULL;
monitor.set_deadline_msecs(timeout_millisec);
// Now run the main recognition.
failed = Recognize(&monitor) < 0;
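
To make the cancel side of that snippet a bit more concrete, here is a minimal sketch of the callback pattern. It deliberately does not link against Tesseract: `MonitorSketch` and `recognizeSketch` are hypothetical stand-ins that only mimic how the `cancel` / `cancel_this` fields get polled while recognition runs, and `shouldCancel` is a name I made up. The callback shape `bool (*)(void *, int)` is modeled on Tesseract's `CANCEL_FUNC` as I understand it, so check `ocrclass.h` in your version before relying on it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for ETEXT_DESC: only the two cancel fields are modeled.
typedef bool (*CancelFunc)(void *cancel_this, int words);

struct MonitorSketch {
    CancelFunc cancel = nullptr;   // polled while recognition is running
    void *cancel_this = nullptr;   // handed back to the callback untouched
};

// Callback: returns true once the flag that cancel_this points at is set.
bool shouldCancel(void *cancel_this, int /*words*/) {
    assert(cancel_this != nullptr);
    return static_cast<std::atomic<bool> *>(cancel_this)->load();
}

// Hypothetical stand-in for Recognize(): processes totalWords units of work and
// polls the callback before each unit. Returns 0 on completion, -1 if cancelled.
// wordsDone (optional) receives the number of units that were processed.
int recognizeSketch(MonitorSketch *monitor, int totalWords, int *wordsDone = nullptr) {
    int done = 0;
    int result = 0;
    for (; done < totalWords; ++done) {
        if (monitor && monitor->cancel &&
            monitor->cancel(monitor->cancel_this, done)) {
            result = -1;   // the caller asked us to stop
            break;
        }
        // ... one unit of recognition work would happen here ...
    }
    if (wordsDone) *wordsDone = done;
    return result;
}
```

With the real API the idea would be the same: the back button handler sets the flag, and the `TessBaseAPI` instance is only deleted after the background block has actually returned from `Recognize()`.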

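Answer 1 (score: 0)

Here is the answer from the tesseract forum: https://groups.google.com/forum/?fromgroups=#!topic/tesseract-ocr/1uLF4BmmmUg

I think the crux of the problem is that you are trying to stop the OCR thread at an arbitrary point in its execution while still expecting the Tesseract instance to be in a consistent state. You are right to want to delete the instance, since otherwise you would leak memory, but it looks like you cannot do that after stopping the OCR thread abnormally. In our own iPhone app (ScanBizCards), what we do in this situation is let the OCR thread finish its work in the background, even though its result will be ignored and never shown to the user. The main drawback is that if the user starts a new scan right after aborting one, we delay the start of the new scan until the previous (aborted) scan has finished.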
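
A rough sketch of that "let it finish, ignore the result" approach, using only the C++ standard library. `ScanRunner` and its methods are hypothetical names for illustration; this is not ScanBizCards' code and nothing here comes from Tesseract. The `ocr` callable is where the blocking `Recognize()` / `GetUTF8Text()` work would go.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <thread>

// Hypothetical helper: runs one scan at a time on a background thread.
class ScanRunner {
public:
    using Ocr = std::function<std::string()>;
    using Deliver = std::function<void(const std::string &)>;

    // Starts a scan. ocr does the slow work; deliver receives the result
    // unless the scan was abandoned in the meantime.
    void start(Ocr ocr, Deliver deliver) {
        assert(ocr && deliver);
        wait();  // a new scan is delayed until the previous one has finished
        abandoned_ = std::make_shared<std::atomic<bool>>(false);
        auto abandoned = abandoned_;
        worker_ = std::thread([ocr, deliver, abandoned] {
            std::string text = ocr();               // always runs to completion
            if (!abandoned->load()) deliver(text);  // result dropped if abandoned
        });
    }

    // Marks the current scan as abandoned; the thread itself is left alone.
    void abandon() {
        if (abandoned_) abandoned_->store(true);
    }

    // Blocks until the current scan (if any) has finished.
    void wait() {
        if (worker_.joinable()) worker_.join();
    }

    ~ScanRunner() { wait(); }

private:
    std::thread worker_;
    std::shared_ptr<std::atomic<bool>> abandoned_;
};
```

In the question's setup, the back button would call `abandon()` instead of deleting `tesseract`, and the instance would be cleaned up only once the worker has finished. Note that `deliver` runs on the worker thread here, so a real app would still have to hop back to the main thread before touching the UI.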

Answer 2 (score: 0)

This program uses two threads to show the progress of Tesseract page processing:

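#include <baseapi.h>
#include <allheaders.h>
#include <atomic>
#include <chrono>
#include <fstream>
#include <iostream>
#include <thread>
using namespace std;
using namespace tesseract;

// Prints the progress reported by Tesseract until recognition has finished
void monitorProgress();

// Runs OCR on the image and writes the hOCR output to a file
void tesseractProcessing();

TessBaseAPI *api;
ETEXT_DESC *monitor = new ETEXT_DESC();
atomic<bool> done(false);   // set by the OCR thread once it has finished

int main()
{
    // One thread runs the OCR, the other polls the progress monitor
    thread t1(tesseractProcessing);
    thread t2(monitorProgress);
    std::cout << "The main function execution\n";
    t1.join();
    t2.join();
    return 0;
}

void monitorProgress()
{
    // Poll until the OCR thread signals completion, so that t2.join() can return
    while (!done)
    {
        // Cast to int so the value is printed as a number, not as a character
        cout << "Current Progress :   " << static_cast<int>(monitor->progress) << endl;
        this_thread::sleep_for(chrono::milliseconds(100));
    }
}

void tesseractProcessing()
{
    api = new TessBaseAPI();
    Pix *image = pixRead("myimage.jpg");
    api->Init("tessdata", "eng", OEM_DEFAULT);
    api->SetPageSegMode(PSM_AUTO);
    api->SetImage(image);

    api->Recognize(monitor);

    cout << "out from recognition" << endl;

    ofstream myfile("myfile.html");
    if (myfile.is_open())
    {
        char *hocr = api->GetHOCRText(0);
        myfile << hocr;
        delete [] hocr;   // the caller owns the returned buffer
    }
    myfile.close();

    // Release Tesseract and Leptonica resources, then let the monitor thread exit
    api->End();
    pixDestroy(&image);
    done = true;
}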