Integrating OCR in Android using Google Drive

Time: 2013-04-20 04:09:28

Tags: android ocr google-drive-api

I have searched a lot, but I have not succeeded yet. I do not want to use the Tesseract method.

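I created the project by following

https://developers.google.com/drive/quickstart-android

and when I run it, the image is uploaded to Google Docs. I get a message saying the file was uploaded successfully, along with the file name.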

To download the data, I used the following:

public void downloadfile(File file)
  {
      String imageAsTextUrl = file.getExportLinks().get("text/plain");

      HttpClient client = new DefaultHttpClient();
      HttpGet get = new HttpGet(imageAsTextUrl);
      HttpResponse response;

      StringBuffer sb = new StringBuffer();
      BufferedReader in = null;
      try 
      {
          response = client.execute(get);
          in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
          String str;
          while ((str = in.readLine()) != null) 
          {
              sb.append(str);
          }
          Log.e(" sb data ", sb.toString());
          in.close();
      } 
      catch (ClientProtocolException e) 
      {
          e.printStackTrace();
      } 
      catch (IOException e) 
      {
          e.printStackTrace();
      }
  }

This is the data I got back:

<!DOCTYPE html><html lang="en"> <head> <meta charset="utf-8"> <title>Welcome to Google Docs</title><style type="text/css"> html, body, div, h1, h2, h3, h4, h5, h6, p, img, dl, dt, dd, ol, ul, li, table, tr, td, form, object, embed, article, aside, canvas, command, details, fieldset, figcaption, figure, footer, group, header, hgroup, legend, mark, menu, meter, nav, output, progress, section, summary, time, audio, video { margin: 0; padding: 0; border: 0; } article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section { display: block; } html { font: 81.25% arial, helvetica, sans-serif; background: #fff; color: #333; line-height: 1; direction: ltr; } a { color: #15c; text-decoration: none; } a:active { color: #d14836; } a:hover { text-decoration: underline; } h1, h2, h3, h4, h5, h6 { color: #222; font-size: 1.54em; font-weight: normal; line-height: 24px; margin: 0 0 .46em; } p { line-height: 17px; margin: 0 0 1em; } ol, ul { list-style: none; line-height: 17px; margin: 0 0 1em; } li { margin: 0 0 .5em; } table { border-collapse: collapse; border-spacing: 0; } strong { color: #222; }</style><style type="text/css"> html, body { position: absolute; height: 100%; min-width: 100%; } .wrapper { position: relative; min-height: 100%; } .wrapper + style + iframe { display: none; } .content { padding: 0 44px; } .topbar { text-align: right; padding-top: .5em; padding-bottom: .5em; } .google-header-bar { height: 71px; background: #f1f1f1; border-bottom: 1px solid #e5e5e5; overflow: hidden; } .header .logo { margin: 17px 0 0; float: left; } .header .signin, .header .signup { margin: 28px 0 0; float: right; font-weight: bold; } .header .signin-button, .header .signup-button { margin: 22px 0 0; float: right; } .header .signin-button a { font-size: 13px; font-weight: normal; } .header .signup-button a { position: relative; top: -1px; margin: 0 0 0 1em; } .main { margin: 0 auto; width: 650px; padding-top: 23px; padding-bottom: 100px; } .main h1:first-child { 
margin: 0 0 .92em; } .google-footer-bar { position: absolute; bottom: 0; height: 35px; width: 100%; border-top: 1px solid #ebebeb; overflow: hidden; } .footer { padding-top: 9px; font-size: .85em; white-space: nowrap; line-height: 0; } .footer ul { color: #999; float: left; max-width: 80%; } .footer ul li { display: inline; padding: 0 1.5em 0 0; } .footer a { color: #333; } .footer .lang-chooser-wrap { float: right; max-width: 20%; } .footer .lang-chooser-wrap img { vertical-align: middle; } .footer .attribution { float: right; } .footer .attribution span { vertical-align: text-top; } .redtext { color: #dd4b39; } .greytext { color: #555; } .secondary { font-size: 11px; color: #666; } .source { color: #093; } .hidden { display: none; } .announce-bar { position: absolute; bottom: 35px; height: 33px; z-index: 2; width: 100%; background: #f9edbe; border-top: 1px solid #efe1ac; border-bottom: 1px solid #efe1ac; overflow: hidden; } .announce-bar .message { font-size: .85em; line-height: 33px; margin: 0; } .announce-bar .message .separated { margin-left: 1.5em; } .announce-bar-ac { background: #eee; border-top: 1px solid #e5e5e5; border-bottom: 1px solid #e5e5e5; } .clearfix:after { visibility: hidden; display: block; font-size: 0; content: '.'; clear: both; height: 0; } * html .clearfix { zoom: 1; } *:first-child+html .clearfix { zoom: 1; } pre { font-family: monospace; position: absolute; left: 0; margin: 0; padding: 1.5em; font-size: 13px; background: #f1f1f1; border-top: 1px solid #e5e5e5; direction: ltr; }</style><style type="text/css"> button, input, select, textarea { font-family: inherit; font-size: inherit; } button::-moz-focus-inner, input::-moz-focus-inner { border: 0; } input[type=email], input[type=number], input[type=password], input[type=tel], input[type=text], input[type=url] { -webkit- "

But this does not contain the text from the image I uploaded. How can I get the text of the image I uploaded?

When I tried the download approach from

Android Open and Save files to/from Google Drive SDK

I got an error on the line "DriveRequest driveRequest = (DriveRequest) request;": HttpRequest cannot be cast to DriveRequest.

Then I tried

DriveRequest driveRequest = DriveRequest.class.cast(request);

but the app crashed on this line with a ClassCastException.

Any relevant answers are welcome. Thanks in advance.

1 Answer:

Answer 0: (score: 0)

I solved this by adding

get.setHeader("Authorization", "Bearer " + credential.getToken());

to the downloadfile method.

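Because my request was not authenticated, the server returned the HTML of the Google Docs welcome page instead of the exported file. After adding this line, I correctly got the recognized text.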
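
For readers who want to see the fix in one place, here is a minimal sketch of an authenticated text download. It is not the poster's code: it uses java.net.HttpURLConnection instead of Apache HttpClient, and the class name AuthorizedDownload and the helpers bearerHeader and downloadText are hypothetical names made up for illustration. It assumes the caller already has the export URL (for example from file.getExportLinks().get("text/plain")) and an OAuth 2.0 access token (for example from credential.getToken()).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names here are illustrative only.
class AuthorizedDownload {

    // Builds the Authorization header value for an OAuth 2.0 bearer token.
    static String bearerHeader(String accessToken) {
        return "Bearer " + accessToken;
    }

    // Fetches exportUrl as text, sending the bearer token with the request.
    // Throws IOException on a non-200 status instead of returning the error body.
    static String downloadText(String exportUrl, String accessToken) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(exportUrl).openConnection();
        conn.setRequestProperty("Authorization", bearerHeader(accessToken));
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            StringBuilder sb = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Keep line breaks so the recognized text stays readable.
                    sb.append(line).append('\n');
                }
            }
            return sb.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android, both the token lookup and the download would need to run off the main thread, for example in a background task. Checking the status code and the response Content-Type before parsing can also help spot the case from the question, where an HTML page comes back instead of plain text.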