Google Cloud Vision API - "image-annotator::Malformed request.: Image processing error!"

Date: 2016-09-21 12:18:02

Tags: pdfbox apache-tika google-cloud-vision

I get the following error when querying the Google Vision API:

{
  "responses" : [ {
    "error" : {
      "code" : 3,
      "message" : "image-annotator::Malformed request.: Image processing error!"
    }
  } ]
}

I pass in a PDF file that contains images, then use PDFBox to extract the images and build a list of AnnotateImageRequest objects:

List<AnnotateImageRequest> visionRequests = new ArrayList<>();
PDDocument document = PDDocument.load(pdfDatastream);
for (PDPage page : document.getPages()) {
    PDResources resources = page.getResources();
    for (COSName xObjectName : resources.getXObjectNames()) {
        PDXObject pdxObject = resources.getXObject(xObjectName);
        if (pdxObject instanceof PDImageXObject) {
            byte[] imageArray = Base64.encodeBase64(IOUtils.toByteArray(((PDImageXObject) pdxObject).createInputStream()));
            System.out.println("image >>" + imageArray.length);
            Image image = new Image();
            image.encodeContent(imageArray);

            Feature feature = new Feature();
            feature.setType("TEXT_DETECTION");

            AnnotateImageRequest annotateImageRequest = new AnnotateImageRequest();
            annotateImageRequest.setImage(image);
            annotateImageRequest.setFeatures(Arrays.asList(feature));
            visionRequests.add(annotateImageRequest);
        }
    }
}

I then pass the list created above to the Vision service:

BatchAnnotateImagesResponse visionSrvcResponse = visionSrvc.images().annotate(new BatchAnnotateImagesRequest().setRequests(visionRequests)).execute();
System.out.println(visionSrvcResponse.toPrettyString());

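I also tried removing the Base64 encoding of the image byte array, but I still get the same error shown at the top. The byte array length is 774800.

Is there something I am missing? When I upload the image to a servlet as a multipart request and pass the byte array obtained from the input stream, it works fine.

I am running the application on Tomcat 8.

Dependencies used: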

<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>1.13</version>
</dependency>
<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.13</version>
</dependency>
<dependency>
    <groupId>com.google.apis</groupId>
    <artifactId>google-api-services-vision</artifactId>
    <version>v1-rev24-1.22.0</version>
</dependency>

1 answer:

Answer 0 (score: 1)

Thanks to Tilman Hausherr.

I changed my code based on his suggestion, and it works:

if (pdxObject instanceof PDImageXObject) {
    // Decode the XObject into a BufferedImage, then write it out as a JPEG file in memory
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write(((PDImageXObject) pdxObject).getImage(), "jpg", baos);
    byte[] imageInByte = baos.toByteArray();

    // Pass the raw JPEG bytes; encodeContent() applies the Base64 encoding itself
    Image image = new Image();
    image.encodeContent(imageInByte);

    Feature feature = new Feature();
    feature.setType("TEXT_DETECTION");

    AnnotateImageRequest annotateImageRequest = new AnnotateImageRequest();
    annotateImageRequest.setImage(image);
    annotateImageRequest.setFeatures(Arrays.asList(feature));
    visionRequests.add(annotateImageRequest);
}
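
A likely reason the original version failed is that the bytes read from createInputStream() are the decoded stream data of the XObject rather than a complete image file, so the Vision API has nothing it can parse as JPEG or PNG. The sketch below uses only the JDK to show the re-encoding step in isolation and a quick way to check the result before sending it. The class name JpegReencoder and the helpers toJpegBytes and hasJpegSignature are hypothetical names made up for this illustration. Copying onto an opaque RGB canvas is an extra precaution I am assuming is useful, because the stock JPEG writer may refuse images that carry an alpha channel.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class for illustration; not part of PDFBox or the Vision client.
final class JpegReencoder {

    /** Re-encodes an already decoded image as a JPEG file held in memory. */
    public static byte[] toJpegBytes(BufferedImage source) throws IOException {
        // Draw onto an opaque RGB canvas so images with transparency can still be written as JPEG.
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.drawImage(source, 0, 0, Color.WHITE, null);
        } finally {
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(rgb, "jpg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }

    /** Returns true when the data starts with the JPEG start-of-image marker (FF D8 FF). */
    public static boolean hasJpegSignature(byte[] data) {
        return data != null
                && data.length >= 3
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8
                && (data[2] & 0xFF) == 0xFF;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the BufferedImage that PDImageXObject.getImage() would return.
        BufferedImage sample = new BufferedImage(16, 16, BufferedImage.TYPE_INT_ARGB);
        byte[] jpeg = toJpegBytes(sample);
        System.out.println("JPEG signature present: " + hasJpegSignature(jpeg));
    }
}
```

In the extraction loop, the result of toJpegBytes(((PDImageXObject) pdxObject).getImage()) would take the place of imageInByte. Running hasJpegSignature on the bytes from the original createInputStream() approach is a cheap way to confirm whether they form a JPEG file at all.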