How do I create an IO object for Aws::S3::Object to stream a download line by line?

Asked: 2016-05-25 22:16:19

Tags: ruby amazon-s3 aws-sdk

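Ruby newbie here. I want to stream a large S3 object (a text file) and process it one line at a time. I don't want to store the object data locally or load it entirely into memory. Aws::S3::Object#get() takes a hash whose response_target parameter can be an IO object, but I'm not sure how to subclass or instantiate an IO instance that does what I want.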

I'd like to end up with something like this:

line_reader = nil # TODO make my IO instance

s3_obj = s3.bucket('bucket-name').object('key')
s3_obj.get({response_target: line_reader}) # returns as soon as streaming begins?

# line_reader does not accumulate response data in memory
line_reader.each do |text_line|
  # do processing on each line independently
end

Thanks!

1 answer:

Answer 0: (score: 0)

According to streaming_data_to_a_block, you can stream data from an S3 object like this:

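# Note: get_object is a method of Aws::S3::Client, so s3 here is a client, not the resource used in the question
s3.get_object(bucket: 'bucket-name', key: 'object-key') do |chunk|
  # process chunk of data
end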
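
The block above receives raw chunks, and chunk boundaries will generally not fall on line boundaries, so the question's line-by-line goal needs a small buffering step. Here is a minimal sketch of one way to do that. LineSplitter and each_s3_line are hypothetical helper names, not part of aws-sdk, and the sketch assumes lines end in "\n" and that `client` responds to get_object the way Aws::S3::Client does.

```ruby
# Hypothetical helper (not part of aws-sdk): collects arbitrary chunks
# and hands complete lines to a callback as soon as they are available.
class LineSplitter
  def initialize(&on_line)
    @on_line = on_line
    @buffer = String.new # binary buffer; holds at most one partial line between chunks
  end

  # Append a chunk and emit every complete line now in the buffer.
  def <<(chunk)
    @buffer << chunk
    while (idx = @buffer.index("\n"))
      @on_line.call(@buffer.slice!(0..idx)) # the line keeps its trailing "\n"
    end
    self
  end

  # Emit whatever is left when the data does not end with a newline.
  def finish
    @on_line.call(@buffer) unless @buffer.empty?
    @buffer = String.new
  end
end

# Assumption: `client` behaves like Aws::S3::Client, i.e. get_object
# yields chunks of the body to the given block.
def each_s3_line(client, bucket:, key:, &block)
  splitter = LineSplitter.new(&block)
  client.get_object(bucket: bucket, key: key) { |chunk| splitter << chunk }
  splitter.finish
end
```

With a real client this would be called as each_s3_line(Aws::S3::Client.new, bucket: 'bucket-name', key: 'object-key') { |line| ... }. Only the current partial line is kept in memory, although a file with one extremely long line would still be buffered in full. The lines arrive as binary strings, so you may want to call force_encoding on them if the file is known to be UTF-8.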