How can I update the metadata of a batch of S3 objects using Ruby?

Asked: 2012-02-14 16:53:36

Tags: ruby amazon-s3 fog

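I need to change some metadata (Content-Type) on hundreds or thousands of objects on S3. What's a good way to do this with Ruby? As far as I can tell, there is no way to save only the metadata with fog.io; the entire object must be re-saved. Using the official SDK library seems like it would require me to roll a wrapper environment just for this one task.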

5 answers:

Answer 0 (score: 7)

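You're right, the official SDK lets you modify object metadata without uploading the object again. What it does is copy the object, but the copy happens on the server, so you don't need to download the file and re-upload it.

A wrapper would be easy to implement, something like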

bucket.objects.each do |object|
  object.metadata['content-type'] = 'application/json'
end

Answer 1 (score: 5)

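In the v2 API, you can use Object#copy_from() or Object#copy_to() with the :metadata and :metadata_directive => 'REPLACE' options to update an object's metadata without downloading it from S3.

The code in Joost's gist raises this error: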

  

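Aws::S3::Errors::InvalidRequest: This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.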

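This is because, by default, AWS ignores the :metadata supplied with a copy operation and copies the existing metadata along with the object instead. If we want to update the metadata in place, we must set the :metadata_directive => 'REPLACE' option.

See http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#copy_from-instance_method

Here is a complete working snippet I recently used to perform a metadata update: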

require 'aws-sdk'

# S3 setup boilerplate
client = Aws::S3::Client.new(
  :region => 'us-east-1',
  :access_key_id => ENV['AWS_ACCESS_KEY'],
  :secret_access_key => ENV['AWS_SECRET_KEY'], 
)
s3 = Aws::S3::Resource.new(:client => client)

# Get an object reference
object = s3.bucket('my-bucket-name').object('my-object/key')

# Create our new metadata hash. This can be any hash; in this example we update
# existing metadata with a new key-value pair.
new_metadata = object.metadata.merge('MY_NEW_KEY' => 'MY_NEW_VALUE')

# Use the copy operation to replace our metadata
object.copy_to(object,
  :metadata => new_metadata,

  # IMPORTANT: normally S3 copies the metadata along with the object.
  # we must supply this directive to replace the existing metadata with
  # the values we supply
  :metadata_directive => "REPLACE",
)

For easy reuse:

def update_metadata(s3_object, new_metadata = {})
  # Copy the object onto itself, replacing its metadata with new_metadata
  s3_object.copy_to(s3_object,
    :metadata => new_metadata,
    :metadata_directive => "REPLACE"
  )
end
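Since the question is about changing Content-Type across many objects, the same copy-in-place idea could be extended to a batch. The following is only a sketch under stated assumptions: `replace_content_type` is a hypothetical helper name, and `objects` is assumed to be an enumerable of `Aws::S3::Object`-like values that respond to `#key`, `#metadata` and `#copy_from`. It passes `:content_type` as a copy option rather than as a metadata key.

```ruby
# Hypothetical helper (not part of the SDK): sets Content-Type on each object
# through a server-side copy of the object onto itself.
# Assumes each element responds to #key, #metadata and #copy_from,
# as Aws::S3::Object does in the v2 SDK.
def replace_content_type(objects, content_type)
  objects.map do |object|
    object.copy_from(
      object,
      content_type: content_type,
      metadata: object.metadata,      # carry the existing user metadata over
      metadata_directive: 'REPLACE'   # required, or the self-copy is rejected
    )
    object.key                        # return the keys that were processed
  end
end
```

With the `s3` resource from the snippet above, it might be called as `replace_content_type(s3.bucket('my-bucket-name').objects(prefix: 'images/').map(&:object), 'image/png')`, where the bucket name and prefix are placeholders. With REPLACE, other headers such as Cache-Control may need to be passed again, and the other answers here also pass :acl explicitly, so it is worth checking one object before running this over a whole bucket.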

Answer 2 (score: 4)

For future readers, here is a complete example of changing this with Ruby aws-sdk v1 (see also this Gist for an aws-sdk v2 example):

# Using v1 of Ruby aws-sdk as currently v2 seems not able to do this (broken?).
require 'aws-sdk-v1'

key = YOUR_AWS_KEY
secret = YOUR_AWS_SECRET
region = YOUR_AWS_REGION

AWS.config(access_key_id: key, secret_access_key: secret, region: region)
s3 = AWS::S3.new
bucket = s3.buckets[bucket_name]
bucket.objects.with_prefix('images/').each do |obj|
  puts obj.key
  # Add  metadata: {} to next line for more metadata.
  obj.copy_from(obj.key, content_type: obj.content_type, cache_control: 'max-age=1576800000',  acl: :public_read)
end

Answer 3 (score: 2)

After some searching, this seems to work for me:

obj.copy_to(obj, :metadata_directive=>"REPLACE", :acl=>"public-read",:content_type=>"text/plain")

Answer 4 (score: 1)

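Changing the content type through the SDK metadata ends up with an x-amz-meta- prefix. My solution is to use Ruby plus the AWS CLI. This writes content-type directly rather than x-amz-meta-content-type.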

ids_to_copy = all_object_ids
ids_to_copy.each do |id|
    object_key = "#{id}.pdf"
    command = "aws s3 cp s3://{bucket-name}/#{object_key} s3://{bucket-name}/#{object_key} --no-guess-mime-type --content-type='application/pdf' --metadata-directive='REPLACE'"
    system(command)
end
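A variation on the loop above, as a sketch only: building the command as an argument list and passing it to `system` with separate arguments skips the shell, so keys containing spaces or quotes need no escaping. The helper names `s3_retype_command` and `retype!` are hypothetical, and the bucket name is whatever you supply. The flags are the same ones used above.

```ruby
# Hypothetical helper: builds the argument list for an in-place `aws s3 cp`
# that rewrites Content-Type. Source and destination are the same URI.
def s3_retype_command(bucket, key, content_type)
  uri = "s3://#{bucket}/#{key}"
  [
    'aws', 's3', 'cp', uri, uri,
    '--no-guess-mime-type',
    '--content-type', content_type,
    '--metadata-directive', 'REPLACE'
  ]
end

# Runs the command without going through a shell. Returns what Kernel#system
# returns: true on success, false on a non-zero exit, nil if it could not run.
def retype!(bucket, key, content_type)
  system(*s3_retype_command(bucket, key, content_type))
end
```

In the loop, `system(command)` would then become something like `retype!('my-bucket-name', object_key, 'application/pdf')`, with the bucket name as a placeholder.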