PHP and ABBYY OCR: saving the recognition result to a variable

Time: 2014-01-10 12:05:20

Tags: php ocr

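I am using the ABBYY API for OCR, and I would like to get the result into a variable for further processing instead of having it downloaded as a file.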

    <?php
    include_once("dBug.php");
    // Name of application you created
    $applicationId = 'telianewtest';
    // Password should be sent to your e-mail after application was created
    $password = 'w0Ye61tWZ6fODm7hIUj9XTeJ';
    $fileName = '20080118155747372_Page_2.jpg';

      // Get path to file that we are going to recognize
      $local_directory = dirname(__FILE__).'/images';
      $filePath = $local_directory.'/'.$fileName;
      if(!file_exists($filePath))
      {
        die('File '.$filePath.' not found.');
      }
      if(!is_readable($filePath) )
      {
         die('Access to file '.$filePath.' denied.');
      }

      // Recognize English text and export the result as XML
      // You can use combination of languages like ?language=english,russian or
      // ?language=english,french,dutch
      // For details, see API reference for processImage method
      $url = 'http://cloud.ocrsdk.com/processImage?language=english&exportFormat=xml';

      // Send HTTP POST request and get XML response
      $curlHandle = curl_init();
      curl_setopt($curlHandle, CURLOPT_URL, $url);
      curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt($curlHandle, CURLOPT_USERPWD, "$applicationId:$password");
      curl_setopt($curlHandle, CURLOPT_POST, 1);
      curl_setopt($curlHandle, CURLOPT_USERAGENT, "PHP Cloud OCR SDK Sample");
      // CURLFile (PHP 5.5+) replaces the "@" prefix, which PHP 7 no longer supports
      $post_array = array(
          "my_file" => new CURLFile($filePath),
      );
      curl_setopt($curlHandle, CURLOPT_POSTFIELDS, $post_array); 
      $response = curl_exec($curlHandle);
      if($response === false) {
        $errorText = curl_error($curlHandle);
        curl_close($curlHandle);
        die($errorText);
      }
      $httpCode = curl_getinfo($curlHandle, CURLINFO_HTTP_CODE);
      curl_close($curlHandle);

      // Parse xml response
      $xml = simplexml_load_string($response);
      if($httpCode != 200) {
        if(property_exists($xml, "message")) {
           die($xml->message);
        }
        die("unexpected response ".$response);
      }

      $arr = $xml->task[0]->attributes();
      $taskStatus = $arr["status"];
      if($taskStatus != "Queued") {
        die("Unexpected task status ".$taskStatus);
      }

      // Task id
      $taskid = $arr["id"];  

      // 4. Get task information in a loop until task processing finishes
      // 5. If the response contains the "Completed" status, extract the result URL
      // 6. Download the recognition result and output it

      $url = 'http://cloud.ocrsdk.com/getTaskStatus';
      $qry_str = "?taskid=$taskid";

      // Check task status in a loop until it is finished
      // TODO: support states indicating error
      while(true)
      {
        sleep(5);
        $curlHandle = curl_init();
        curl_setopt($curlHandle, CURLOPT_URL, $url.$qry_str);
        curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curlHandle, CURLOPT_USERPWD, "$applicationId:$password");
        curl_setopt($curlHandle, CURLOPT_USERAGENT, "PHP Cloud OCR SDK Sample");
        $response = curl_exec($curlHandle);
        $httpCode = curl_getinfo($curlHandle, CURLINFO_HTTP_CODE);
        curl_close($curlHandle);

        // parse xml
        $xml = simplexml_load_string($response);
        if($httpCode != 200) {
          if(property_exists($xml, "message")) {
            die($xml->message);
          }
          die("Unexpected response ".$response);
        }
        $arr = $xml->task[0]->attributes();
        $taskStatus = $arr["status"];
        if($taskStatus == "Queued" || $taskStatus == "InProgress") {
          // continue waiting
          continue;
        }
        if($taskStatus == "Completed") {
          // exit this loop and proceed to handling the result
          break;
        }
        if($taskStatus == "ProcessingFailed") {
          die("Task processing failed: ".$arr["error"]);
        }
        die("Unexpected task status ".$taskStatus);
      }

      // Result is ready. Download it

      $url = $arr["resultUrl"];   
      $curlHandle = curl_init();
      curl_setopt($curlHandle, CURLOPT_URL, $url);
      curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
      // Warning! This is for easier out-of-the box usage of the sample only.
      // The URL to the result has https:// prefix, so SSL is required to
      // download from it. For whatever reason PHP runtime fails to perform
      // a request unless SSL certificate verification is off.
      curl_setopt($curlHandle, CURLOPT_SSL_VERIFYPEER, false);
      $response = curl_exec($curlHandle);

      curl_close($curlHandle);


      // Let the user download the XML result
      header('Content-Type: application/xml');
      header('Content-Disposition: attachment; filename="file.xml"');
      echo $response;
    ?>

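I tried accessing the $xml variable, without success... any ideas? Thanks in advance.

(I have included the password because it is a dummy account, in case anyone needs to take a look.)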
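For reference, here is a minimal sketch of one way the result could be kept in a variable. At the end of the script above, `$response` already holds the downloaded XML as a string, so the two `header()` calls and the `echo` could be dropped and the string parsed instead. The helper name `ocrXmlToText` and the sample XML are hypothetical, for illustration only. The real export schema is not shown in this post, so the element names are assumptions.

```php
<?php
// Hypothetical helper: takes an XML string (for example the $response
// downloaded from resultUrl) and returns all of its text content as one
// string, or null when the input is not well-formed XML.
function ocrXmlToText($xmlString)
{
    // Collect parser errors internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        return null;
    }

    // textContent concatenates every text node below the root element
    $text = dom_import_simplexml($doc)->textContent;

    // Collapse runs of whitespace left over from the XML layout
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Stand-in for the downloaded result; the actual schema may differ.
$response = '<document><page><line>Hello</line><line> world</line></page></document>';

$resultXml = simplexml_load_string($response); // structured access to the result
$plainText = ocrXmlToText($response);          // flat text for further processing
```

`$resultXml` and `$plainText` can then be used later in the same script. Note that `$xml` in the original code holds the task status response from `getTaskStatus`, not the recognition result, which may be why reading it did not give the expected data.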

0 answers:

No answers