I am using Spring Integration AWS to poll an S3 bucket, fetch files from it, and process them with Spring Integration. Here is what I have:
AmazonS3 amazonS3 = new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));

@Bean
IntegrationFlow fileReadingFlow() {
    return IntegrationFlows
            .from(s3InboundFileSynchronizingMessageSource(),
                    e -> e.poller(p -> p.fixedDelay(30, TimeUnit.SECONDS)))
            .handle(receiptProcessor())
            .get();
}

@Bean
public S3InboundFileSynchronizer s3InboundFileSynchronizer() {
    S3InboundFileSynchronizer synchronizer = new S3InboundFileSynchronizer(amazonS3);
    synchronizer.setDeleteRemoteFiles(false);
    synchronizer.setPreserveTimestamp(true);
    synchronizer.setRemoteDirectory(s3BucketName.concat("/").concat(s3InboundFolder));
    synchronizer.setFilter(new S3RegexPatternFileListFilter(".*\\.dat\\.{0,1}\\d{0,2}"));
    return synchronizer;
}

@Bean
public S3InboundFileSynchronizingMessageSource s3InboundFileSynchronizingMessageSource() {
    S3InboundFileSynchronizingMessageSource messageSource =
            new S3InboundFileSynchronizingMessageSource(s3InboundFileSynchronizer());
    messageSource.setAutoCreateLocalDirectory(false);
    messageSource.setLocalDirectory(new File(inboundDir));
    messageSource.setLocalFilter(new AcceptOnceFileListFilter<File>());
    return messageSource;
}
My S3 bucket and key are:
bucketName = shipmentReceipts
key = receipts/originalReceipts/inbound/receipt1.dat
I am facing 2 issues with this implementation:

1. The inboundDir folder name gets turned into a different path, with the S3 key appended to it, resulting in a FileNotFoundException. I traced this to the following code in AbstractInboundFileSynchronizer.java:
protected void copyFileToLocalDirectory(String remoteDirectoryPath, F remoteFile, File localDirectory,
        Session<F> session) throws IOException {
    String remoteFileName = this.getFilename(remoteFile);
    String localFileName = this.generateLocalFileName(remoteFileName);  // <-- builds the local name
    String remoteFilePath = remoteDirectoryPath != null
            ? (remoteDirectoryPath + this.remoteFileSeparator + remoteFileName)
            : remoteFileName;
    if (!this.isFile(remoteFile)) {
        if (this.logger.isDebugEnabled()) {
            this.logger.debug("cannot copy, not a file: " + remoteFilePath);
        }
        return;
    }
    File localFile = new File(localDirectory, localFileName);  // <-- key path becomes part of the local path
    if (!localFile.exists()) { ...
So it ends up with the file path C:\SpringAws\S3inbound\receipts\originalReceipts\inbound\receipt1.dat, which it cannot find, and it throws a FileNotFoundException. Instead, it should simply copy the file to the local folder as C:\SpringAws\S3inbound\receipt1.dat.
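To illustrate, here is a minimal sketch (with a hypothetical local directory name) of how `File(parent, child)` simply joins the two paths, so a local file name that still carries the S3 key path points at folders that do not exist locally:

```java
import java.io.File;

public class LocalPathDemo {

    public static void main(String[] args) {
        // Hypothetical local directory; the "file name" still carries the S3 key path
        File localDirectory = new File("S3inbound");
        String localFileName = "receipts/originalReceipts/inbound/receipt1.dat";

        // File(parent, child) just concatenates the paths, so the nested key
        // folders end up in the local path even though they don't exist on disk
        File localFile = new File(localDirectory, localFileName);
        System.out.println(localFile.getPath());
    }
}
```

Opening a stream on that path fails because the intermediate directories were never created locally.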
2. While pulling the S3 objects, I noticed that it pulls down all objects under shipmentReceipts/receipts instead of shipmentReceipts/receipts/originalReceipts/inbound.
While debugging further, I found the following snippet in S3Session.java to be responsible:
@Override
public S3ObjectSummary[] list(String path) throws IOException {
    Assert.hasText(path, "'path' must not be empty String.");
    String[] bucketPrefix = path.split("/");
    Assert.state(bucketPrefix.length > 0 && bucketPrefix[0].length() >= 3,
            "S3 bucket name must be at least 3 characters long.");
    String bucket = resolveBucket(bucketPrefix[0]);
    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
            .withBucketName(bucket);
    if (bucketPrefix.length > 1) {
        listObjectsRequest.setPrefix(bucketPrefix[1]);  // <-- only the first path segment becomes the prefix
    }
    /*
     For listing objects, Amazon S3 returns up to 1,000 keys in the response.
     If you have more than 1,000 keys in your bucket, the response will be truncated.
     You should always check for if the response is truncated.
     */
    ObjectListing objectListing;
    List<S3ObjectSummary> objectSummaries = new ArrayList<>();
    do { ...
It sets the prefix using only the segment right after the first forward slash /, so the rest of the nested path is lost.
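A quick sketch of that behavior, using the path from the question: `split("/")` breaks the path into all its segments and only `bucketPrefix[1]` is used, whereas splitting with a limit of 2 would keep the whole nested prefix:

```java
public class PrefixSplitDemo {

    public static void main(String[] args) {
        String path = "shipmentReceipts/receipts/originalReceipts/inbound";

        // What S3Session does: split on every slash and use only element [1]
        String[] bucketPrefix = path.split("/");
        System.out.println(bucketPrefix[1]);        // prints: receipts

        // Splitting with a limit of 2 would preserve the full nested prefix
        System.out.println(path.split("/", 2)[1]);  // prints: receipts/originalReceipts/inbound
    }
}
```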
How can I mitigate these issues? Thanks!
Answer 0 (score: 0)
The first problem with nested paths is a known issue; it has been fixed via the RecursiveDirectoryScanner in the latest 5.0 M3: https://spring.io/blog/2017/04/05/spring-integration-5-0-milestone-3-available

Meanwhile, you have to specify a LocalFilenameGeneratorExpression like this:
Expression expression = PARSER.parseExpression("#this.contains('/') ? #this.substring(#this.lastIndexOf('/') + 1) : #this");
synchronizer.setLocalFilenameGeneratorExpression(expression);
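A plain-Java equivalent of what that SpEL expression computes, using a sample key from the question, looks like this:

```java
public class FilenameExpressionDemo {

    public static void main(String[] args) {
        // Mirrors: #this.contains('/') ? #this.substring(#this.lastIndexOf('/') + 1) : #this
        String key = "receipts/originalReceipts/inbound/receipt1.dat";
        String localName = key.contains("/")
                ? key.substring(key.lastIndexOf('/') + 1)
                : key;
        System.out.println(localName);  // prints: receipt1.dat
    }
}
```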
The S3ObjectSummary contains the key as a full path, without the bucket.

The second "nested path" problem has been fixed via https://github.com/spring-projects/spring-integration-aws/issues/45. The fix is available in 1.1.0.M1: https://spring.io/blog/2017/03/09/spring-integration-extension-for-aws-1-1-0-m1-available
Answer 1 (score: 0)
As per Artem's answer, I did try the latest milestone version of spring-integration-aws, but I found it easier to write a custom class extending AbstractInboundFileSynchronizer to solve my problem. Here is the class I created:
public class MyAbstractInboundFileSynchronizer extends AbstractInboundFileSynchronizer<S3ObjectSummary> {

    private volatile String remoteFileSeparator = "/";

    private volatile String temporaryFileSuffix = ".writing";

    private volatile boolean deleteRemoteFiles;

    private volatile boolean preserveTimestamp;

    private volatile FileListFilter<S3ObjectSummary> filter;

    private volatile Expression localFilenameGeneratorExpression;

    private volatile EvaluationContext evaluationContext;

    @Override
    public void setLocalFilenameGeneratorExpression(Expression localFilenameGeneratorExpression) {
        super.setLocalFilenameGeneratorExpression(localFilenameGeneratorExpression);
        this.localFilenameGeneratorExpression = localFilenameGeneratorExpression;
    }

    @Override
    public void setIntegrationEvaluationContext(EvaluationContext evaluationContext) {
        super.setIntegrationEvaluationContext(evaluationContext);
        this.evaluationContext = evaluationContext;
    }

    @Override
    public void setRemoteFileSeparator(String remoteFileSeparator) {
        super.setRemoteFileSeparator(remoteFileSeparator);
        this.remoteFileSeparator = remoteFileSeparator;
    }

    public MyAbstractInboundFileSynchronizer() {
        this(new S3SessionFactory());
    }

    public MyAbstractInboundFileSynchronizer(AmazonS3 amazonS3) {
        this(new S3SessionFactory(amazonS3));
    }

    /**
     * Create a synchronizer with the {@link SessionFactory} used to acquire {@link Session} instances.
     * @param sessionFactory The session factory.
     */
    public MyAbstractInboundFileSynchronizer(SessionFactory<S3ObjectSummary> sessionFactory) {
        super(sessionFactory);
        setRemoteDirectoryExpression(new LiteralExpression(null));
        setFilter(new S3PersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), "s3MessageSource"));
    }

    @Override
    public final void setRemoteDirectoryExpression(Expression remoteDirectoryExpression) {
        super.setRemoteDirectoryExpression(remoteDirectoryExpression);
    }

    @Override
    public final void setFilter(FileListFilter<S3ObjectSummary> filter) {
        super.setFilter(filter);
    }

    @Override
    protected boolean isFile(S3ObjectSummary file) {
        return true;
    }

    @Override
    protected String getFilename(S3ObjectSummary file) {
        if (file != null) {
            // Strip the key path so only the bare file name remains
            String key = file.getKey();
            return key.substring(key.lastIndexOf('/') + 1);
        }
        else {
            return null;
        }
    }

    @Override
    protected long getModified(S3ObjectSummary file) {
        return file.getLastModified().getTime();
    }

    @Override
    protected void copyFileToLocalDirectory(String remoteDirectoryPath, S3ObjectSummary remoteFile, File localDirectory,
            Session<S3ObjectSummary> session) throws IOException {
        String remoteFileName = this.getFilename(remoteFile);
        // getFilename() above already strips the key path, so use it directly
        // instead of this.generateLocalFileName(remoteFileName)
        String localFileName = remoteFileName;
        String remoteFilePath = remoteDirectoryPath != null
                ? (remoteDirectoryPath + remoteFileName)
                : remoteFileName;
        if (!this.isFile(remoteFile)) {
            if (this.logger.isDebugEnabled()) {
                this.logger.debug("cannot copy, not a file: " + remoteFilePath);
            }
            return;
        }
        File localFile = new File(localDirectory, localFileName);
        if (!localFile.exists()) {
            String tempFileName = localFile.getAbsolutePath() + this.temporaryFileSuffix;
            File tempFile = new File(tempFileName);
            OutputStream outputStream = new BufferedOutputStream(new FileOutputStream(tempFile));
            try {
                session.read(remoteFilePath, outputStream);
            }
            catch (Exception e) {
                if (e instanceof RuntimeException) {
                    throw (RuntimeException) e;
                }
                else {
                    throw new MessagingException("Failure occurred while copying from remote to local directory", e);
                }
            }
            finally {
                try {
                    outputStream.close();
                }
                catch (Exception ignored2) {
                }
            }
            if (tempFile.renameTo(localFile)) {
                if (this.deleteRemoteFiles) {
                    session.remove(remoteFilePath);
                    if (this.logger.isDebugEnabled()) {
                        this.logger.debug("deleted " + remoteFilePath);
                    }
                }
            }
            if (this.preserveTimestamp) {
                localFile.setLastModified(getModified(remoteFile));
            }
        }
    }
}
I also updated the LocalFilenameGeneratorExpression as per Artem's answer. Thanks!
Answer 2 (score: 0)
@user5758361 The first problem you describe with nested paths can also be solved by overriding S3FileInfo:
public class S3FileInfo extends org.springframework.integration.aws.support.S3FileInfo {

    private static final ObjectWriter OBJECT_WRITER = new ObjectMapper().writerFor(S3ObjectSummary.class);

    public S3FileInfo(S3ObjectSummary s3ObjectSummary) {
        super(s3ObjectSummary);
    }

    @Override
    public String getFilename() {
        return FilenameUtils.getName(super.getFilename());
    }

    @Override
    public String toJson() {
        try {
            return OBJECT_WRITER.writeValueAsString(super.getFileInfo());
        }
        catch (JsonProcessingException e) {
            throw new UncheckedIOException(e);
        }
    }
}
toJson is overridden to avoid an NPE for some objects.
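FilenameUtils.getName comes from Apache Commons IO; if that dependency is not available, the standard library's java.nio Path#getFileName gives the same result for a key like the one in the question:

```java
import java.nio.file.Paths;

public class FileNameDemo {

    public static void main(String[] args) {
        String key = "receipts/originalReceipts/inbound/receipt1.dat";
        // Path#getFileName strips the directory part, like FilenameUtils.getName
        System.out.println(Paths.get(key).getFileName());  // prints: receipt1.dat
    }
}
```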
Then use it for streaming:
public class S3StreamingMessageSource extends org.springframework.integration.aws.inbound.S3StreamingMessageSource {

    public S3StreamingMessageSource(RemoteFileTemplate<S3ObjectSummary> template) {
        super(template, null);
    }

    public S3StreamingMessageSource(RemoteFileTemplate<S3ObjectSummary> template,
            Comparator<AbstractFileInfo<S3ObjectSummary>> comparator) {
        super(template, comparator);
    }

    @Override
    protected List<AbstractFileInfo<S3ObjectSummary>> asFileInfoList(Collection<S3ObjectSummary> collection) {
        return collection.stream()
                .map(S3FileInfo::new)
                .collect(toList());
    }
}
BTW, I'm using Spring Integration 5.0.0.M4 and Spring Integration AWS 1.1.0.M2, and I still see the same problem when using a bucket name like abc/def/.