Question

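I have a Discovery instance in my IBM Bluemix account, and I want to add documents from a local folder to a private collection in that instance. I do this by calling a recursive function, starting from the main local folder. The program itself works fine; however, after a few iterations of adding documents, I run into the following error: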

Aug 08, 2017 1:55:07 PM okhttp3.internal.platform.Platform log
INFO: --> POST https://gateway.watsonplatform.net/discovery/api/v1/environments/{environmentId}/collections/{collectionId}/documents?version=2017-08-01 http/1.1 (-1-byte body)
Aug 08, 2017 1:59:09 PM okhttp3.internal.platform.Platform log
INFO: <-- HTTP FAILED: java.net.SocketException: Connection reset by peer: socket write error
Aug 08, 2017 1:59:10 PM okhttp3.internal.platform.Platform log
INFO: --> POST https://gateway.watsonplatform.net/discovery/api/v1/environments/{environmentId}/collections/{collectionId}/documents?version=2017-08-01 http/1.1 (-1-byte body)
Aug 08, 2017 1:59:10 PM okhttp3.internal.platform.Platform log
INFO: <-- HTTP FAILED: java.io.IOException: Stream Closed
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Stream Closed

Here is how it works. I first initialize the Discovery instance:

Discovery discovery = new Discovery("2017-08-01");
discovery.setEndPoint("https://gateway.watsonplatform.net/discovery/api");
discovery.setUsernameAndPassword({username}, {password});

Then, for each file f in the folder whose MIME type mimetype is supported, I do this:

CreateDocumentRequest.Builder builder = new CreateDocumentRequest.Builder({environmentId}, {collectionId}).file(f, mimetype);
CreateDocumentResponse createResponse = discovery.createDocument(builder.build()).execute();

Is it possible that the Discovery instance times out during the loop? Should I initialize a new Discovery instance for each request?

Update

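I am fairly sure the exception is caused by a connection problem. I am now trying to add the documents in a way that reinitializes the Discovery instance whenever the connection is lost. However, the retry then fails with INFO: <-- HTTP FAILED: java.io.IOException: Stream Closed.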

boolean successful;
do {
    try {
        CreateDocumentResponse createResponse = this.discovery.createDocument(builder.build()).execute();
        System.out.println(createResponse.toString());
        successful = true;
    } catch (Exception e) {
        System.err.println("Exception: " + e.getMessage());
        try {
            TimeUnit.MILLISECONDS.sleep(500);
        } catch (InterruptedException e1) {
            System.err.println("InterruptedException: " + e1.getMessage());
        }
        this.discovery = new Discovery("2017-08-01");
        this.discovery.setEndPoint("https://gateway.watsonplatform.net/discovery/api");
        this.discovery.setUsernameAndPassword(DataUploader.USERNAME, DataUploader.PASSWORD);
        successful = false;
    }
} while (!successful);

Answer 1

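Based on what you have included, your approach seems reasonable. You do not need to create a new instance of the Discovery class for every add-document request. I think the core issue here is the handling of the file stream you pass to CreateDocumentRequest.Builder. As far as I can tell, the file stream is being closed prematurely.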
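To illustrate the kind of failure I mean, here is a minimal Java sketch that uses only the standard library and does not touch the Watson SDK. The class name ClosedStreamDemo and the helper readFails are made up for this illustration. It shows that a FileInputStream cannot be read again once it has been closed, while a freshly opened stream on the same file reads fine. On typical JDKs the message of the exception is the same "Stream Closed" text that appears in your log, though I would not rely on the exact wording.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the Watson SDK.
class ClosedStreamDemo {

    /** Returns true if reading one byte from the stream fails with an IOException. */
    static boolean readFails(InputStream in) {
        try {
            in.read();
            return false;
        } catch (IOException e) {
            System.out.println("Read failed: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small temporary file to stand in for a document.
        Path tmp = Files.createTempFile("doc", ".json");
        Files.write(tmp, "{}".getBytes(StandardCharsets.UTF_8));

        // Simulate a stream that was already closed by an earlier attempt.
        InputStream reused = new FileInputStream(tmp.toFile());
        reused.close();
        System.out.println("Reused stream fails: " + readFails(reused));

        // A stream opened fresh for this attempt can be read normally.
        try (InputStream fresh = new FileInputStream(tmp.toFile())) {
            System.out.println("Fresh stream fails: " + readFails(fresh));
        }

        Files.delete(tmp);
    }
}
```

If your retry loop reuses a builder that wraps an already consumed or closed InputStream, the second attempt would fail in the same way no matter how often the Discovery instance is recreated.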

Here is a Scala example that does something similar: it uploads all the files in a folder, and it runs without problems.

import java.nio.file.{Files, Paths}
import com.ibm.watson.developer_cloud.discovery.v1.Discovery
import com.ibm.watson.developer_cloud.discovery.v1.model.document.CreateDocumentRequest
import com.ibm.watson.developer_cloud.http.HttpMediaType

object Run {
  def main(args: Array[String]): Unit = {
    if(args.length == 0) {
      println("Usage: <app> <folder-to-upload>")
      System.exit(0)
    }

    val discovery = new Discovery("2017-08-01")
    discovery.setEndPoint("https://gateway.watsonplatform.net/discovery/api")
    discovery.setUsernameAndPassword("{username}", "{password}")

    val environmentId = "<environment-id>"
    val collectionId = "<collection-id>"

    Files.list(Paths.get(args(0))).forEach { path =>
      println(s"Processing ${path.getFileName}")
      val createDocumentBuilder = new CreateDocumentRequest.Builder(environmentId, collectionId)
        .file(path.toFile, HttpMediaType.APPLICATION_JSON)
      val response = discovery.createDocument(createDocumentBuilder.build()).execute()
      println(s"DocumentID ${response.getDocumentId}")
    }
  }
}

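From the snippet provided I cannot tell which method you are using, but given the choice I would use Builder#file(File inputFile, String mediaType) rather than Builder#file(InputStream content, String mediaType). Otherwise, you must make sure the stream is not closed before the request has been built and sent to the server.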
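If you do need the InputStream variant together with a retry loop, one option is to open a new stream for every attempt instead of reusing the one from a failed attempt. The following is only a sketch under that assumption. RetryingUpload, Upload, and uploadWithRetry are hypothetical names, and the Upload callback stands in for whatever actually builds and executes the Discovery request; none of this is part of the Watson SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;

// Hypothetical helper; not part of the Watson SDK.
class RetryingUpload {

    /** Stand-in for the real upload call, e.g. building and executing the request. */
    interface Upload<T> {
        T send(InputStream content) throws IOException;
    }

    /**
     * Calls upload with a freshly opened stream on every attempt, so a stream
     * closed by a failed attempt is never reused.
     */
    static <T> T uploadWithRetry(Callable<InputStream> openStream, Upload<T> upload,
                                 int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // try-with-resources closes this attempt's stream whether it succeeds or fails.
            try (InputStream in = openStream.call()) {
                return upload.send(in);
            } catch (IOException e) {
                last = e;
                System.err.println("Attempt " + attempt + " failed: " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // simple fixed delay before the next attempt
                }
            }
        }
        throw last; // every attempt failed; rethrow the most recent I/O error
    }
}
```

A call could look like uploadWithRetry(() -> new FileInputStream(f), in -> send(in), 3, 500), where send is your own method that builds the request from the stream and executes it. Only IOException triggers a retry here; if the client wraps I/O failures in a RuntimeException, as the stack trace in the question suggests, the catch clause would have to be widened accordingly.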

IBM Watson: when recursively adding documents to a collection