I lose documents after setting the client's REQUEST_ENTITY_PROCESSING to CHUNKED

Time: 2016-10-02 15:27:13

Tags: java http jersey streaming

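I have a REST web service running on Jetty. I want to write a Java client that streams a large number of documents to that REST service in chunks over a single HTTP connection.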

I was able to set up the iterator-based streaming approach described here:

Sending a stream of documents to a Jersey @POST endpoint

Because the Content-Length is unknown, this does not work unless you set clientConfig.property(ClientProperties.REQUEST_ENTITY_PROCESSING, RequestEntityProcessing.CHUNKED);

It mostly works, but the chunked transfer seems to lose some documents. For example:

num_docs 500000
numFound 499249

Maybe it is sending chunks like this:

{some:doc}, {some:doc}, {some:doc}, {some:doc}, {some:doc}, {some:doc}, {some:do

So I lose a few documents each time? UPDATE: I was wrong about this.
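The update fits how chunked transfer encoding is framed: each chunk is just a length-prefixed slice of the body, and the receiver concatenates the slices before the application sees them, so a chunk ending in the middle of a document does not by itself drop anything. Below is a minimal sketch of that framing, independent of Jersey and Jetty. The class name ChunkFramingDemo, the helper names, and the chunk size are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ChunkFramingDemo {

    // Frame the payload as HTTP/1.1 chunks: <hex length>CRLF<data>CRLF ... 0 CRLF CRLF
    static byte[] encode(byte[] payload, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < payload.length; off += chunkSize) {
            int len = Math.min(chunkSize, payload.length - off);
            byte[] header = (Integer.toHexString(len) + "\r\n").getBytes(StandardCharsets.US_ASCII);
            out.write(header, 0, header.length);
            out.write(payload, off, len);
            out.write('\r');
            out.write('\n');
        }
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(last, 0, last.length);
        return out.toByteArray();
    }

    // Undo the framing: read each length line, copy that many bytes, skip the trailing CRLF.
    static byte[] decode(byte[] framed) {
        ByteArrayInputStream in = new ByteArrayInputStream(framed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            StringBuilder hex = new StringBuilder();
            for (int c = in.read(); c != '\r' && c != -1; c = in.read()) {
                hex.append((char) c);
            }
            in.read(); // consume '\n' after the length line
            int len = Integer.parseInt(hex.toString(), 16);
            if (len == 0) {
                break; // terminating zero-length chunk
            }
            byte[] data = new byte[len];
            int n = in.read(data, 0, len);
            out.write(data, 0, n);
            in.read(); // consume '\r' after the chunk data
            in.read(); // consume '\n' after the chunk data
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = "{some:doc}, {some:doc}, {some:doc}".getBytes(StandardCharsets.UTF_8);
        // A chunk size of 7 deliberately splits documents in the middle.
        byte[] roundTrip = decode(encode(body, 7));
        System.out.println(Arrays.equals(body, roundTrip));
    }
}
```

If the decoded body matches the original even with an awkward chunk size, the missing documents have to come from somewhere other than the chunk boundaries.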

How do I stop it from doing that? Any other ideas about what might be happening?

    ClientConfig clientConfig = new ClientConfig();
    clientConfig.property(ClientProperties.CONNECT_TIMEOUT, (int)TimeUnit.SECONDS.toMillis(60));
    clientConfig.property(ClientProperties.REQUEST_ENTITY_PROCESSING, RequestEntityProcessing.CHUNKED);
    clientConfig.property(ClientProperties.ASYNC_THREADPOOL_SIZE, 100);
    clientConfig.property(ApacheClientProperties.CONNECTION_MANAGER, HttpClientFactory.createConnectionManager(name,
      metricRegistry, configuration));
    ApacheConnectorProvider connector = new ApacheConnectorProvider();
    clientConfig.connectorProvider(connector);
    // Remove any Content-Length header and re-add its value under a custom "Length" header.
    clientConfig.register(new ClientRequestFilter() {
      @Override
      public void filter(ClientRequestContext requestContext) throws IOException {
        List<Object> orig = requestContext.getHeaders().remove(HttpHeaders.CONTENT_LENGTH);
        if (orig != null && !orig.isEmpty()) {
          requestContext.getHeaders().addAll("Length", orig);
        }
      }
    });
    // For multipart requests, make sure the Content-Type carries a boundary and MIME-Version is set.
    clientConfig.register(new ClientRequestFilter() {
      @Override
      public void filter(ClientRequestContext requestContext) throws IOException {
        if (requestContext.getMediaType() != null &&
            requestContext.getMediaType().getType() != null &&
            requestContext.getMediaType().getType().equalsIgnoreCase("multipart")) {
          final MediaType boundaryMediaType = Boundary.addBoundary(requestContext.getMediaType());
          if (boundaryMediaType != requestContext.getMediaType()) {
            requestContext.getHeaders().putSingle(HttpHeaders.CONTENT_TYPE, boundaryMediaType.toString());
          }
          if (!requestContext.getHeaders().containsKey("MIME-Version")) {
            requestContext.getHeaders().putSingle("MIME-Version", "1.0");
          }
        }
      }
    });

1 Answer:

Answer 0: (score: 3)

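Closing this. I was accidentally closing the stream prematurely, so it really was losing documents at the end. That gave me the hint to wait until the blocking queue is empty and only then shut down the executor.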
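For anyone hitting the same thing, here is a rough sketch of the shutdown ordering, not my actual client code. It swaps in an end-of-input marker on the queue instead of polling for emptiness, since an empty queue alone does not prove the writer has finished its last document. The class name DrainBeforeClose, the END marker, and the ByteArrayOutputStream standing in for the chunked request body are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class DrainBeforeClose {

    // Hypothetical end-of-input marker; compared by identity and never written out.
    private static final String END = new String("END");

    /** Pushes numDocs documents through a queue and returns how many the writer wrote. */
    static int streamAll(int numDocs) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the request body
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Writer task: only this task closes the stream, and only after it has seen END.
        Future<Integer> written = executor.submit(() -> {
            int count = 0;
            try (OutputStream out = sink) {
                for (String doc = queue.take(); doc != END; doc = queue.take()) {
                    out.write(doc.getBytes(StandardCharsets.UTF_8));
                    out.write('\n');
                    count++;
                }
            }
            return count;
        });

        // Producer side: enqueue everything, then signal the end instead of closing the stream.
        for (int i = 0; i < numDocs; i++) {
            queue.put("{\"id\":" + i + "}");
        }
        queue.put(END);

        // shutdown() stops new submissions but lets the running writer drain the queue.
        executor.shutdown();
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("writer did not finish in time");
        }
        return written.get(); // rethrows any failure from the writer task
    }

    public static void main(String[] args) throws Exception {
        int numDocs = 500000;
        System.out.println("num_docs " + numDocs);
        System.out.println("written " + streamAll(numDocs));
    }
}
```

The point is the ordering: the producer never closes the stream itself, and the executor is shut down with shutdown() plus awaitTermination() rather than shutdownNow(), so documents still sitting in the queue are written before the body ends.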