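I'm trying to read from S3, given a key and a bucket, to get an input stream, i.e. an S3ObjectInputStream.
Any insight into why I'm running into this problem? Everything runs fine locally, but when I run it on EMR I get the error below: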
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
One thing I tried was calling s3object.close before returning the value. But then I get Exception in thread "main" java.io.IOException: Attempted read on closed stream.
So I gave up on that...
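import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.S3ObjectInputStream

// Returns the object's content stream; the caller is responsible for closing it.
def getS3Object(s3Client: AmazonS3, bucketName: String, key: String): S3ObjectInputStream = {
  val s3Object = s3Client.getObject(bucketName, key)
  val objectContent = s3Object.getObjectContent
  objectContent
}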
Exception in thread "main" com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1175)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1121)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4914)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4860)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1467)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1326)
    at content.spark.ContentIngestion.getS3Object(ContentIngestion.scala:54)
    at content.spark.ContentIngestion$$anonfun$1.apply(ContentIngestion.scala:45)
    at content.spark.ContentIngestion$$anonfun$1.apply(ContentIngestion.scala:45)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at content.spark.ContentIngestion.getSolrDocuments(ContentIngestion.scala:45)
    at content.spark.Main$.main(Main.scala:57)
    at content.spark.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
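
Taken together, the two errors look consistent with streams being returned without ever being closed (each open stream likely keeps holding a pooled HTTP connection), while closing before the caller reads gives the "closed stream" error. One possible way around both, as a rough sketch only: read the content fully inside the function, close the resource, and return the bytes instead of the stream. The names StreamDraining, withResource, readFully and drainAndClose are made up for illustration, and this assumes each object is small enough to buffer in memory.

```scala
import java.io.{ByteArrayOutputStream, Closeable, InputStream}

object StreamDraining {
  /** Runs `body` on `resource` and always closes the resource afterwards. */
  def withResource[R <: Closeable, A](resource: R)(body: R => A): A =
    try body(resource)
    finally resource.close()

  /** Copies everything from `in` into memory. Does not close `in`. */
  def readFully(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buffer = new Array[Byte](8192)
    var n = in.read(buffer)
    while (n != -1) {
      out.write(buffer, 0, n)
      n = in.read(buffer)
    }
    out.toByteArray
  }

  /** Opens a stream, drains it, and closes it before returning the bytes. */
  def drainAndClose(open: => InputStream): Array[Byte] =
    withResource(open)(readFully)

  // Hypothetical S3 usage (assumes the AWS SDK v1 types from the question,
  // where S3Object is Closeable; not compiled here):
  //   def getS3Bytes(s3Client: AmazonS3, bucketName: String, key: String): Array[Byte] =
  //     withResource(s3Client.getObject(bucketName, key))(obj => readFully(obj.getObjectContent))
}
```

With something like this, the map in getSolrDocuments would work on byte arrays (or on a new ByteArrayInputStream wrapped around them), so no S3 stream would stay open after the function returns. For objects too large to buffer, the alternative would be to keep the stream but make sure the caller closes it after processing each one.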