How to read a Snappy-compressed file from S3 in Java

Time: 2015-04-23 07:04:42

Tags: java hadoop amazon-s3 snappy

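We currently run MapReduce jobs in Hadoop whose output is compressed with Snappy compression. We then move the output files to S3. Now I want to read those compressed files from S3.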

1 answer:

Answer 0 (score: 0)

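I found a way to read Snappy-compressed files from S3. First, fetch the object content from S3 as a stream. Then decompress that stream with Hadoop's Snappy codec and read it line by line.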

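    import java.io.*;
    import java.nio.charset.StandardCharsets;
    import com.amazonaws.services.s3.model.*;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.*;
    import org.apache.hadoop.util.ReflectionUtils;
    // Inside a method; s3Client, bucketName and Path (the object key) are defined elsewhere.
    S3Object s3object = s3Client.getObject(new GetObjectRequest(bucketName, Path));
    InputStream inContent = s3object.getObjectContent();
    // Instantiate Hadoop's SnappyCodec (depending on the Hadoop version, this may need the native Snappy library).
    CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, new Configuration());
    // Wrap the S3 stream so the data is decompressed while it is read.
    InputStream inStream = codec.createInputStream(new BufferedInputStream(inContent));
    // Assumes the job wrote UTF-8 text output.
    BufferedReader br = new BufferedReader(new InputStreamReader(inStream, StandardCharsets.UTF_8));
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
    br.close(); // also closes the underlying S3 stream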
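
The fetch-then-decompress pattern in this answer can also be tried without S3 or Hadoop on the classpath. Below is a minimal sketch of the same stream wrapping using only the JDK. GZIP is used purely as a stand-in for Snappy, because the JDK ships no Snappy codec. The names `CompressedLineReader`, `StreamWrapper`, `readLines`, and `gzip` are hypothetical and are not part of the AWS SDK or Hadoop.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedLineReader {

    /** Hypothetical hook: wraps a raw stream in a decompressing stream. */
    interface StreamWrapper {
        InputStream wrap(InputStream raw) throws IOException;
    }

    /** Reads all lines from a compressed stream and closes the streams afterwards. */
    static List<String> readLines(InputStream raw, StreamWrapper wrapper) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the raw stream even if wrapping or reading fails.
        try (InputStream in = raw;
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(wrapper.wrap(in), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /** Demo helper: GZIP-compresses a string (stand-in for a Snappy-compressed file). */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for s3object.getObjectContent().
        InputStream fakeS3Content =
                new ByteArrayInputStream(gzip("key1\tvalue1\nkey2\tvalue2\n"));
        for (String line : readLines(fakeS3Content, GZIPInputStream::new)) {
            System.out.println(line);
        }
    }
}
```

With the AWS SDK and Hadoop available, the call would presumably become `readLines(s3object.getObjectContent(), codec::createInputStream)`, where `codec` is the `SnappyCodec` instance from the answer. Closing the stream in all cases matters for S3 because the object content stream holds an open HTTP connection until it is closed.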