S3 doesObjectExist API does not recognize a path containing an equal sign

Time: 2018-08-20 19:49:44

Tags: apache-spark amazon-s3

I have an Apache Spark application that writes to an S3 folder. Because the Spark application partitions the data as it writes to S3, it adds an EQUAL sign to the path, like this:

s3://biops/XXX/YYY/entryDateYear=2018/entryDateMonth=07
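For context, directories like these come out of a partitioned write in Spark. Below is a minimal sketch of such a write; the session setup, input path, output format, and exact column list are assumptions inferred from the paths above, not taken from the actual job.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder().appName("royalty-writer").getOrCreate(); // hypothetical app name
Dataset<Row> royalties = spark.read().json("s3://biops/XXX/YYY/input/"); // hypothetical source path

// Each distinct (entryDateYear, entryDateMonth) pair becomes a directory such as
// entryDateYear=2018/entryDateMonth=07 under the output path.
royalties.write()
        .mode(SaveMode.Append)
        .partitionBy("entryDateYear", "entryDateMonth")
        .json("s3://biops/XXX/YYY/royalty_raw_json/");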

I fully understand that S3 does not allow creating a bucket_name containing "=", but Spark Streaming creates each partition as field_name followed by "=" and then the value.

Please advise how to access an S3 folder whose name contains an equal sign.

// The actual path in S3 is biops/XXX/YYY/royalty_raw_json/entryDateYear=2018/

String bucket = "biops";
String without_equal = "XXX/YYY/royalty_raw_json/";
String with_equal = "XXX/YYY/royalty_raw_json/entryDateYear=2018";
String with_equal_encoding = "XXX/YYY/royalty_raw_json/entryDateYear%3D2018";

AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard()
        .withCredentials(getCredentialsProvider(credentials)) // getCredentialsProvider is a helper defined elsewhere in the application
        .withRegion("us-east-1")
        .build();
amazonS3.doesObjectExist(bucket, without_equal);       // works
amazonS3.doesObjectExist(bucket, with_equal);          // does not work
amazonS3.doesObjectExist(bucket, with_equal_encoding); // does not work either
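My reading of this (an assumption, not something stated in the post): S3 has a flat keyspace, and doesObjectExist returns true only when the given key is itself an object. The without_equal check probably succeeds because a zero-byte "folder" marker object exists at that exact key, whereas entryDateYear=2018 is only a common prefix of the part files Spark wrote, not an object, and URL-encoding the "=" just produces a key that does not exist at all. Checking the exact key of one of the part files should succeed; the file name below is hypothetical.

// Hypothetical part-file name; real names are generated by the Spark job.
String exactKey = "XXX/YYY/royalty_raw_json/entryDateYear=2018/entryDateMonth=07/part-00000-c000.json";
amazonS3.doesObjectExist(bucket, exactKey); // true only if an object with exactly this key exists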

UPDATE: I managed to solve this with the object listing shown below, which checks whether the folder exists.

ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucket)
        .withPrefix(with_equal);
ObjectListing bucketListing = amazonS3.listObjects(listObjectsRequest);
if (bucketListing != null && bucketListing.getObjectSummaries() != null && bucketListing.getObjectSummaries().size() > 0)
    System.out.println("Folder present, with files");
else
    System.out.println("Folder present with zero files, or folder not present");

0 answers:

There are no answers.