Restoring data from Amazon Glacier

Date: 2017-10-10 15:27:46

Tags: amazon-web-services amazon-s3 amazon-glacier

Is there a command-line way to restore data from Glacier? So far I have tried:

s3cmd restore --recursive s3://mybucketname/folder/

aws s3 ls s3://<bucket_name> | awk '{print $4}' | xargs -L 1 aws s3api restore-object --restore-request Days=<days> --bucket <bucket_name> --key

But neither helped. PS: I know we can do this through the console.

4 Answers:

Answer 0 (score: 0)

Unfortunately, this is not possible. Objects archived into Amazon Glacier can only be accessed by going through Amazon S3.

参考:http://docs.aws.amazon.com/AmazonS3/latest/user-guide/restore-archived-objects.html

Answer 1 (score: 0)

Here is an example of restoring an object using Java. From this link you can do the same in the language of your choice.

Restore an Archived Object Using the AWS SDK for Java
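If you prefer Python, a roughly equivalent call with boto3 might look like the sketch below. This is only a sketch, not taken from the linked example; the bucket and key names mirror the placeholders used in the CLI answer further down.

import boto3

s3 = boto3.client("s3")

# Ask S3 to make a temporary copy of the archived object available for 1 day,
# using the Standard retrieval tier (typically takes a few hours for Glacier).
s3.restore_object(
    Bucket="my-bucket",        # placeholder bucket name
    Key="path/to/key.blob",    # placeholder key
    RestoreRequest={
        "Days": 1,
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)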

Answer 2 (score: 0)

You can use the lower-level aws s3api command:

aws s3api restore-object --request-payer requester \
                         --key path/to/key.blob \
                         --bucket my-bucket \
                         --cli-input-json "$(cat request.json)"

Then set the parameters inside request.json, for example:

{
    "RestoreRequest": {
        "Days": 1,
        "GlacierJobParameters": {
            "Tier": "Standard"
        }
    }
}

After initiating the restore request, you will have to call head-object to check its restore status:

aws s3api head-object --key path/to/key.blob \
                      --bucket my-bucket \
                      --request-payer requester
{
    "AcceptRanges": "bytes",
    "Restore": "ongoing-request=\"true\"",
    "LastModified": "Thu, 30 May 2019 22:43:48 GMT",
    "ContentLength": 1573320976,
    "ETag": "\"5e9bae0592655103e72d0c026e643184-94\"",
    "ContentType": "application/x-gzip",
    "Metadata": {
        "digest-md5": "7ace7afadfaec591a7dcff2b942df701",
        "import-digests": "md5"
    },
    "StorageClass": "GLACIER",
    "RequestCharged": "requester"
}

The restore is complete once Restore contains ongoing-request="false". The temporary copy in S3 lasts for the duration you specified in the restore command. Note that for any restored file the StorageClass is always GLACIER (or DEEP_ARCHIVE), even after the restore has completed, which is counter-intuitive.
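If you are scripting the wait, a small boto3 sketch like the following (the poll interval and names are arbitrary assumptions) can watch the Restore header until it reports ongoing-request="false":

import time
import boto3

s3 = boto3.client("s3")

def wait_for_restore(bucket, key, poll_seconds=300):
    # Poll head_object until the Restore header no longer reports an ongoing request.
    while True:
        resp = s3.head_object(Bucket=bucket, Key=key)
        restore_header = resp.get("Restore", "")
        if 'ongoing-request="false"' in restore_header:
            return resp
        time.sleep(poll_seconds)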

If you want that copy to live in S3 permanently, i.e. change the storage class from GLACIER to STANDARD, you will need to put/copy the restored copy (possibly on top of itself) as a new object. Which is annoying.
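A hedged boto3 sketch of that in-place copy (placeholder bucket/key; it assumes the restore has already completed and the object is at most 5 GB, since copy_object cannot copy larger objects in a single call):

import boto3

s3 = boto3.client("s3")

bucket, key = "my-bucket", "path/to/key.blob"   # placeholders

# Copy the restored object onto itself with a new storage class so the
# permanent copy lives in STANDARD instead of GLACIER.
s3.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    StorageClass="STANDARD",
)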


Note: --request-payer requester is optional. I used it in my setup, but if you are the owner of the bucket you do not need it.

Answer 3 (score: -1)

You can use the Python script below to restore Glacier data back to S3.

import argparse
import sys
import time

import boto3

parser = argparse.ArgumentParser()
parser.add_argument('--max-rate-mb', action='store', type=int, default=10000, help='The maximum rate in MB/h to restore files at.  Files larger than this will not be restored.')
parser.add_argument('--restore-days', action='store', type=int, default=30, help='How many days restored objects will remain in S3.')
parser.add_argument('--restore-path', action='store', help='The bucket/prefix to restore from')
parser.add_argument('--pretend', action='store_true', help='Do not execute restores')
parser.add_argument('--estimate', action='store_true', help='When pretending, do not check for already-restored files')
args = parser.parse_args()

if not args.restore_path:
    print('No restore path specified.')
    sys.exit(1)

# Split "bucket/prefix" into the bucket name and the key prefix to restore.
if '/' in args.restore_path:
    BUCKET, PREFIX = args.restore_path.split('/', 1)
else:
    BUCKET = args.restore_path
    PREFIX = ''

RATE_LIMIT_BYTES = args.max_rate_mb * 1024 * 1024

s3 = boto3.Session(aws_access_key_id='<ACCESS_KEY>', aws_secret_access_key='<SECRET_KEY>').resource('s3')
bucket = s3.Bucket(BUCKET)

# List every object under the prefix, 100 per page.
objects = []
objcount = 0
for objpage in bucket.objects.filter(Prefix=PREFIX).page_size(100).pages():
    for obj in objpage:
        objcount += 1
        print(obj)
        objects.append(obj)
    print('Found {} objects.'.format(objcount))
print()

# Restore the largest objects first, and only consider objects still in GLACIER.
objects.sort(key=lambda x: x.size, reverse=True)
objects = [obj for obj in objects if obj.storage_class == 'GLACIER']

if objects:
    obj = objects[0]
    print('The largest object found is of {} size: {:14,d}  {:1s}  {}'.format(('a restorable' if obj.size <= RATE_LIMIT_BYTES else 'an UNRESTORABLE'), obj.size, obj.storage_class[0], obj.key))
    print()

while objects:
    # Build a batch whose combined size fits within the hourly rate limit.
    current_set = []
    current_set_total = 0
    unreported_unrestoreable_objects = []
    i = 0
    while i < len(objects):
        obj = objects[i]

        if obj.size > RATE_LIMIT_BYTES:
            unreported_unrestoreable_objects.append(obj)
        elif unreported_unrestoreable_objects:
            # No longer accumulating these.  Print the ones we found.
            print('Some objects could not be restored due to exceeding the hourly rate limit:')
            for unrestorable in unreported_unrestoreable_objects:
                print('- {:14,d}  {:1s}  {}'.format(unrestorable.size, unrestorable.storage_class[0], unrestorable.key))
            print()
            unreported_unrestoreable_objects = []

        if current_set_total + obj.size <= RATE_LIMIT_BYTES:
            # Skip objects that already have a restore in progress or completed,
            # unless we are only producing an estimate.
            if not args.pretend or not args.estimate:
                if obj.Object().restore is not None:
                    objects.pop(i)
                    continue
            current_set.append(obj)
            current_set_total += obj.size
            objects.pop(i)
            continue
        i += 1

    # Issue the restore requests for this batch.
    for obj in current_set:
        print('{:14,d}  {:1s}  {}'.format(obj.size, obj.storage_class[0], obj.key))
        if not args.pretend:
            obj.restore_object(RestoreRequest={'Days': args.restore_days})

    print('{:s} Requested restore of {:d} objects consisting of {:,d} bytes.  {:d} objects remaining.  {:,d} bytes of hourly restore rate wasted'.format(time.strftime('%Y-%m-%d %H:%M:%S'), len(current_set), current_set_total, len(objects), RATE_LIMIT_BYTES - current_set_total))
    print()
    if not objects:
        break
    if not args.pretend:
        # Wait a little over an hour before submitting the next batch.
        time.sleep(3690)

Command to run the script:

python restore_glacier_data_to_s3.py --restore-path s3-bucket-name/folder-name/