我正在使用FlatFileItemReader并扩展了AbstractResource以从Amazon S3对象返回流。
S3Object amazonS3Object = s3client.getObject(new GetObjectRequest(bucket,file));
InputStream stream = null;
stream = amazonS3Object.getObjectContent();
return stream;
在我的批处理作业中,我还实现了MultiFileResourcePartitioner,其中我给了存储桶来分区所有文件。我只能读取一些文件的一部分,之后我得到一个套接字重置错误。见下面的错误
.ResourcelessTransactionManager$ResourcelessTransaction@122ba881]
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=9
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:68 - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@252ce07a
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:76 - Chunk execution starting: queue size=0
2015-08-24 23:24:03 DEBUG ResourcelessTransactionManager:367 - Creating new transaction with name [null]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
2015-08-24 23:24:03 DEBUG RepeatTemplate:464 - Starting repeat context.
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=1
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=2
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=3
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=4
2015-08-24 23:24:03 DEBUG DefaultClientConnection:160 - Connection 0.0.0.0:58171<->10.37.135.39:8099 shut down
2015-08-24 23:24:03 DEBUG DefaultClientConnection:176 - Connection 0.0.0.0:58171<->10.37.135.39:8099 closed
Caused by: org.springframework.batch.item.file.NonTransientFlatFileException: Unable to read from resource: [null]
at org.springframework.batch.item.file.FlatFileItemReader.readLine(FlatFileItemReader.java:220)
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:173)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:83)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:133)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:121)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207)
at com.sun.proxy.$Proxy17.read(Unknown Source)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)
... 22 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:196)
at java.net.SocketInputStream.read(SocketInputStream.java:122)</i>
我的要求是处理来自S3存储桶的数百万条记录的文件,并且应用程序在AWS上运行。我已经通过重试和打开连接传递了S3客户端配置,但没有多大帮助。
答案 0 :(得分:0)
正如@Michael Minella所提到的,使用Spring Cloud AWS项目获取资源可能是您的选择:
@Autowired
private ResourcePatternResolver resourcePatternResolver;
public void resolveAndLoad() throws IOException {
Resource[] allTxtFilesInFolder = this.resourcePatternResolver.getResources("s3://bucket/name/*.txt");
Resource[] allTxtFilesInBucket = this.resourcePatternResolver.getResources("s3://bucket/**/*.txt");
Resource[] allTxtFilesGlobally = this.resourcePatternResolver.getResources("s3://**/*.txt");
}
然后将资源传递给MultiFileResourcePartitioner以查看异常是否可以解决。