I wrote a script to copy S3 objects from my production S3 bucket to my development bucket, but it takes a long time to run because I check each object individually for existence before copying. Is there a way to diff the two buckets and copy only the objects I need? Or to copy the whole bucket in one go?
Here's what I have so far:
count = 0
puts "COPYING FROM #{prod_bucket} to #{dev_bucket}"
bm = Benchmark.measure do
  AWS::S3.new.buckets[prod_bucket].objects.each do |o|
    exists = AWS::S3.new.buckets[dev_bucket].objects[o.key].exists?
    if exists
      puts "Skipping: #{o.key}"
    else
      puts "Copy: #{o.key} (#{count})"
      o.copy_to(o.key, :bucket_name => dev_bucket, :acl => :public_read)
      count += 1
    end
  end
end
puts "Copied #{count} objects in #{bm.real}s"
Answer 0 (score: 2)
I've never used that gem, but from your code it looks like it can give you an array of all the keys stored in a bucket. Load the key lists for both buckets and determine the missing files with simple array operations. That should be considerably faster, because it replaces one existence check per object with a single listing of each bucket.
# load file lists (looks up objects in batches of 1000)
source_files = AWS::S3.new.buckets[prod_bucket].objects.map(&:key)
target_files = AWS::S3.new.buckets[dev_bucket].objects.map(&:key)

# determine files missing in dev
files_to_copy = source_files - target_files
files_to_copy.each_with_index do |file_name, i|
  puts "Copying #{i}/#{files_to_copy.size}: #{file_name}"
  S3Object.store(file_name,
                 S3Object.value(file_name, PROD_BUCKET_NAME),
                 DEV_BUCKET_NAME)
end

# determine files on dev that do not exist on prod
files_to_remove = target_files - source_files
files_to_remove.each_with_index do |file_name, i|
  puts "Removing #{i}/#{files_to_remove.size}: #{file_name}"
  S3Object.delete(file_name, DEV_BUCKET_NAME)
end