OCR performs poorly on large images (with a lot of text) - Google Cloud Vision API

Time: 2016-11-30 16:39:27

Tags: ocr google-cloud-vision

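We've noticed that the Google Vision API performs poorly when an image contains a lot of text. It returns 'strange' results.

Here is an example:

https://www.dropbox.com/s/vhqxxwgj4stvfc9/screenwithproblem.jpg?dl=0 - returns the following: https://www.dropbox.com/s/r3gkn38rw36agvs/Screenshot%202016-11-30%2011.26.20.jpg?dl=0

If we send only part of that image, everything works fine. This can also be checked on the API's demo page (cloud.google.com/vision).

We tried different images and ran into the same problem.

Could you tell us whether we are doing something wrong, or whether this is an issue on Google's side?

Thanks in advance!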

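1 answer:

Answer 0: (score: 0)

I've noticed some of the same "strange results" in documents, especially in areas where the document quality is low, such as faded or blurry print. It seems that in some cases the API guesses the language of the text incorrectly.

Each page of the result should tell you which languages were detected and what share of the page each one accounts for.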
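
To make that per-page language breakdown easier to inspect, here is a minimal Python sketch. It assumes the response has already been parsed into a dict shaped like the v1 REST JSON (fullTextAnnotation, then pages, then property.detectedLanguages, each entry carrying languageCode and confidence). The helper name detected_languages and the sample values are made up for illustration; they are not output from the image in the question.

```python
def detected_languages(response):
    """Return (page_index, language_code, confidence) tuples from a parsed
    Vision API response dict. Missing keys are treated as empty."""
    rows = []
    annotation = response.get("fullTextAnnotation", {})
    for page_index, page in enumerate(annotation.get("pages", [])):
        languages = page.get("property", {}).get("detectedLanguages", [])
        for lang in languages:
            rows.append(
                (page_index, lang.get("languageCode"), lang.get("confidence", 0.0))
            )
    return rows


# Hypothetical response fragment, for illustration only.
sample_response = {
    "fullTextAnnotation": {
        "pages": [
            {
                "property": {
                    "detectedLanguages": [
                        {"languageCode": "en", "confidence": 0.8},
                        {"languageCode": "fil", "confidence": 0.2},
                    ]
                }
            }
        ]
    }
}

for page_index, code, confidence in detected_languages(sample_response):
    print(f"page {page_index}: {code} ({confidence:.0%})")
```

If an unexpected language shows up with a noticeable share on a page whose text you know is in one language, that is a hint that language detection went wrong for that page.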

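In that case, you may want to supply a predefined list of languages (or a single language, if it is known) through the request's languageHints to reduce the number of incorrectly detected languages. (https://cloud.google.com/nodejs/docs/reference/vision/0.22.x/google.cloud.vision.v1p1beta1#.AnnotateImageRequest)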
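
As a rough illustration of passing language hints, the sketch below builds a JSON body for the REST images:annotate endpoint with imageContext.languageHints set. The function name build_ocr_request, the placeholder image bytes, and the choice of DOCUMENT_TEXT_DETECTION are assumptions for the example, not something the question specified. Actually sending the request (authentication, HTTP client) is left out.

```python
import base64
import json


def build_ocr_request(image_bytes, language_hints):
    """Build an images:annotate request body that restricts OCR language
    detection to the given BCP-47 language codes."""
    return {
        "requests": [
            {
                # Image content is sent as base64-encoded text in JSON.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


# Placeholder bytes stand in for a real image file read in binary mode.
body = build_ocr_request(b"abc", ["en"])
print(json.dumps(body, indent=2))
```

Keeping the hint list short (ideally the one language you expect) leaves the API less room to guess a wrong language on faded or blurry regions.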