Elasticsearch Snapshot & Restore in AWS

Time: 2017-07-04 13:42:07

Tags: amazon-web-services elasticsearch

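I am using AWS ES (the managed service). AWS takes an automated backup every day. I want to implement something similar, but more frequent.

To do this, I created an S3 bucket, registered it as a repository in the ES cluster, and wrote a scheduler that takes a snapshot of the cluster at specified times.

ES snapshots are incremental in nature, i.e. all existing snapshots are loaded into memory to determine which changes need to be saved in the current snapshot.

Over time, the number of snapshots will grow.

I want to keep a specific number of snapshots and delete the rest. We could write another scheduler for that.

However, until our snapshot-creation scheduler runs again, the snapshots that remain would not be sufficient to restore the whole cluster.

Is there a good way to handle this?

Please suggest.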

2 answers:

Answer 0: (score: 2)

Q1: How to set up manual index snapshots for Amazon Elasticsearch Service.

https://github.com/miztiik/AWS-Demos/tree/master/How-To/setup-manual-elasticsearch-snapshots

S3-Bucket-Name =    xxxxxxx-es-snapshot-repo
ES-IAM-Role =       xxxxxxx-es-snapshot-role
ES-REPO-NAME=       xxxxxxx-es-snapshot-repository
ES-IAM-USER =       xxxxxxx-es-snapshot-user
ES-IAM-Policy =     xxxxxxx-es-snapshot-access
ES-POLICY=          xxxxxxx-es-allow-role
ES-DOMAIN-NAME =    xxxxxxx-waf-logs
ES-END-POINT =      https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com

================================================ =========================

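==>A snapshot is a backup of a cluster's data and state. State includes cluster settings, node information, index settings, and shard allocation. Elasticsearch snapshots are incremental, meaning that they store only the data that has changed since the last successful snapshot. Because of this incremental nature, the difference in disk usage between frequent and infrequent snapshots is usually small.

==>Snapshots provide a convenient way to migrate data across Amazon Elasticsearch Service domains and to recover from failure. Automated snapshots are read-only from within a given domain. You cannot use automated snapshots to migrate to a new domain. For migrations, you must use manual snapshots.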
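Because snapshots are incremental at the file level, deleting an old snapshot removes only the files that no remaining snapshot still references, so every snapshot that is kept can still be restored in full on its own. A retention job can therefore delete the oldest snapshots without waiting for the next one to be taken. Below is a minimal sketch of the selection step only, assuming the response shape of `GET _snapshot/<repo>/_all`; the function name `snapshots_to_delete` and the parameter `keep` are hypothetical.

```python
def snapshots_to_delete(snapshots, keep):
    """Return the names of completed snapshots older than the newest `keep`.

    `snapshots` is the list stored under the "snapshots" key of a
    GET _snapshot/<repo>/_all response. Snapshots that are not in the
    SUCCESS state (for example IN_PROGRESS) are never selected.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    completed = [s for s in snapshots if s["state"] == "SUCCESS"]
    # Newest first, so everything past index `keep` is a deletion candidate.
    completed.sort(key=lambda s: s["start_time_in_millis"], reverse=True)
    return [s["snapshot"] for s in completed[keep:]]


if __name__ == "__main__":
    # Illustrative data only; names and timestamps are made up.
    sample = [
        {"snapshot": "snap-a", "state": "SUCCESS", "start_time_in_millis": 100},
        {"snapshot": "snap-b", "state": "SUCCESS", "start_time_in_millis": 200},
        {"snapshot": "snap-c", "state": "SUCCESS", "start_time_in_millis": 300},
        {"snapshot": "snap-d", "state": "IN_PROGRESS", "start_time_in_millis": 400},
    ]
    print(snapshots_to_delete(sample, keep=2))
```

Each returned name would then be removed with `DELETE _snapshot/<repo>/<name>`. If the requests are signed with IAM credentials, the user's policy would presumably also need `es:ESHttpGet` and `es:ESHttpDelete` for this job.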

================================================ =========================

Prerequisites

================================================ =========================

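  1. Elasticsearch domain: ES-DOMAIN-NAME = xxxxxxx-waf-logs

  2. Create an S3 bucket: xxxxxxx-es-snapshot-repo

  3. Get the bucket ARN: arn:aws:s3:::xxxxxxx-es-snapshot-repo

  4. IAM role: xxxxxxx-es-snapshot-role

Note: Attach the following permissions to the role, and be sure to change the bucket ARN.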

================================================ =========================

 {
   "Version": "2012-10-17",
   "Statement": [{
       "Action": [
         "s3:ListBucket"
       ],
       "Effect": "Allow",
       "Resource": [
         "arn:aws:s3:::xxxxxxx-es-snapshot-repo"
       ]
     },
     {
       "Action": [
         "s3:GetObject",
         "s3:PutObject",
         "s3:DeleteObject"
       ],
       "Effect": "Allow",
       "Resource": [
         "arn:aws:s3:::xxxxxxx-es-snapshot-repo/*"
       ]
     }
   ]
 }

================================================ =========================

Assign the following trust relationship to the role

================================================ =========================

 {
   "Version": "2012-10-17",
   "Statement": [{
     "Sid": "",
     "Effect": "Allow",
     "Principal": {
       "Service": "es.amazonaws.com"
     },
     "Action": "sts:AssumeRole"
   }]
 }
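If the role is created from a script rather than the console, the two documents above can be generated from the bucket name so the ARN only has to be changed in one place. This is a rough sketch using only the standard library; `build_role_policies` is a hypothetical helper, not part of any AWS SDK.

```python
import json


def build_role_policies(bucket_name):
    """Build the S3 permissions policy and the trust policy for the snapshot role."""
    bucket_arn = "arn:aws:s3:::" + bucket_name
    permissions = {
        "Version": "2012-10-17",
        "Statement": [
            # Listing applies to the bucket itself.
            {"Action": ["s3:ListBucket"], "Effect": "Allow", "Resource": [bucket_arn]},
            # Object reads, writes, and deletes apply to the keys inside it.
            {
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }
    trust = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "",
            "Effect": "Allow",
            # Lets the Elasticsearch service assume the role.
            "Principal": {"Service": "es.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    return permissions, trust


if __name__ == "__main__":
    permissions, trust = build_role_policies("xxxxxxx-es-snapshot-repo")
    print(json.dumps(permissions, indent=2))
    print(json.dumps(trust, indent=2))
```

The JSON strings could then be passed to boto3's IAM client, for example as `AssumeRolePolicyDocument` in `create_role` and as `PolicyDocument` in `put_role_policy`.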

================================================ =========================

  5. IAM user with AWS CLI access: xxxxxxx-es-snapshot-user

================================================ =========================

Attach the following policy to the user

================================================ =========================

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::467657035428:role/xxxxxxx-es-snapshot-role"
        },
        {
            "Effect": "Allow",
            "Action": "es:ESHttpPut",
            "Resource": "arn:aws:es:us-east-1:467657035428:domain/xxxxxxx-waf-logs/*"
        }
    ]
}

================================================ =========================

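  6. An EC2 instance running Linux that can connect to the ES cluster; simply host it in the same VPC/subnet as ES.

The AWS CLI on that instance should be configured with the IAM user xxxxxxx-es-snapshot-user.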

================================================ =========================

Register a manual snapshot repository

================================================ =========================

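==>You must register a snapshot repository with Amazon Elasticsearch Service before you can take manual index snapshots. If your ES domain resides within a VPC, your computer must be connected to the VPC for the repository registration to succeed.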

================================================ =========================

Prepare the EC2 client to register our S3 repository

================================================ =========================

Note: Change the host, region, and role ARN in the code below to match your environment.

================================================ =========================

Install the prerequisite packages

================================================ =========================

yum -y install python-pip

pip install boto3 requests requests-aws4auth

================================================ =========================

Create a Python file to register the repository

================================================ =========================

cat >/tmp/register-repo.py <<"EOF"
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/'
region = 'us-east-1' # For example, us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Register repository
path = '_snapshot/xxxxxxx-es-snapshot-repository' # the Elasticsearch API endpoint
url = host + path

payload = {
  "type": "s3",
  "settings": {
    "bucket": "xxxxxxx-es-snapshot-repo",
    "region": "us-east-1",
    "role_arn": "arn:aws:iam::467657035428:role/xxxxxxx-es-snapshot-role"
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)
print(r.text)
EOF

================================================ =========================

Execute the file to register the repository

================================================ =========================

chmod 700 /tmp/register-repo.py

================================================ =========================

python /tmp/register-repo.py

200
{"acknowledged":true}

================================================ =========================

Take a manual snapshot

================================================ =========================

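When you create a snapshot, you specify two pieces of information:

  1. The name of the snapshot repository, for example: xxxxxxx-es-snapshot-repository

  2. A name for the snapshot, for example: 2019-02-01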

================================================ =========================

Note: Snapshots are not instantaneous; they take some time to complete.

================================================ =========================

curl -XPUT 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/_snapshot/xxxxxxx-es-snapshot-repository/2019-02-28'
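The curl call above is unsigned, so it may be rejected if the domain's access policy only allows IAM principals. In that case the same request can be signed the way the repository registration script signs its request. The following is a sketch under that assumption; `snapshot_url` and `take_snapshot` are hypothetical helper names, and the snapshot is named after today's date purely for illustration.

```python
from datetime import date


def snapshot_url(host, repository, name):
    """Build the snapshot endpoint URL for a repository and snapshot name."""
    return "{}/_snapshot/{}/{}".format(host.rstrip("/"), repository, name)


def take_snapshot(host, repository, name, region="us-east-1"):
    """Send a SigV4-signed PUT that starts a snapshot; returns (status, body)."""
    # Third-party imports are local so snapshot_url works without them installed.
    import boto3
    import requests
    from requests_aws4auth import AWS4Auth

    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, "es", session_token=credentials.token)
    response = requests.put(snapshot_url(host, repository, name), auth=awsauth)
    return response.status_code, response.text


if __name__ == "__main__":
    host = "https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com"
    name = date.today().isoformat()  # e.g. a date-based name, one per day
    # Only the URL is printed here; call take_snapshot(...) to actually start one.
    print(snapshot_url(host, "xxxxxxx-es-snapshot-repository", name))
```

A scheduler that runs more than once a day would need a finer-grained name, since a snapshot name can only be used once per repository.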

================================================ =========================

Use the following command to verify the state of your domain's snapshots:

================================================ =========================

curl -XGET 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/_snapshot/xxxxxxx-es-snapshot-repository/_all?pretty'

Output:

{
  "snapshots" : [ {
    "snapshot" : "snapshot-name",
    "uuid" : "FciYMhzFR1iLs0I0Nb1YeA",
    "version_id" : 6040299,
    "version" : "6.4.2",
    "indices" : [ "logs", "awswaf-2019-02-13", "logstash-2019.02.06", "filebeat-6.6.0-2019.02.19", "logstash-2019.02.13", "awswaf-2019-02-21", "awswaf-2019-02-24", "logstash-2019.02.15", "logs-2019-02-15", "filebeat-6.6.0-2019.02.26", "logstash-2019.02.21", "logs-2019-02-13", "awswaf-2019-02-01", "logstash-2019.02.20", "awswaf-2019-02-07", "awswaf-2019-02-26"],
    "include_global_state" : true,
    "state" : "SUCCESS",
    "start_time" : "2019-02-26T13:31:59.721Z",
    "start_time_in_millis" : 1551187919721,
    "end_time" : "2019-02-26T16:24:48.806Z",
    "end_time_in_millis" : 1551198288806,
    "duration_in_millis" : 10369085,
    "failures" : [ ],
    "shards" : {
      "total" : 330,
      "failed" : 0,
      "successful" : 330
    }
  }, {
    "snapshot" : "2019-02-01",
    "uuid" : "pHwGshbJRGO-C47uCuuFDw",
    "version_id" : 6040299,
    "version" : "6.4.2",
    "indices" : [ "logs", "awswaf-2019-02-13", "logstash-2019.02.06", "filebeat-6.6.0-2019.02.19", "logstash-2019.02.13", "awswaf-2019-02-21", "awswaf-2019-02-24", "logstash-2019.02.15", "logs-2019-02-15", "filebeat-6.6.0-2019.02.26", "logstash-2019.02.21", "logs-2019-02-13", "awswaf-2019-02-01", "filebeat-6.6.0-2019.02.27", "logstash-2019.02.20", "awswaf-2019-02-07", "awswaf-2019-02-26", "awswaf-2019-02-10", "kibana_sample_data_flights"],
    "include_global_state" : true,
    "state" : "IN_PROGRESS",
    "start_time" : "2019-02-27T06:51:30.836Z",
    "start_time_in_millis" : 1551250290836,
    "end_time" : "1970-01-01T00:00:00.000Z",
    "end_time_in_millis" : 0,
    "duration_in_millis" : -1551250290836,
    "failures" : [ ],
    "shards" : {
      "total" : 0,
      "failed" : 0,
      "successful" : 0
    }
  } ]
}
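In a scheduler, the same response can be checked programmatically instead of by eye. A minimal sketch, assuming the response has already been parsed from JSON into a dict; `snapshot_states` and `is_restorable` are hypothetical helper names, and the sample data is trimmed from the output above.

```python
def snapshot_states(response):
    """Map each snapshot name to its state (for example SUCCESS or IN_PROGRESS)."""
    return {s["snapshot"]: s["state"] for s in response["snapshots"]}


def is_restorable(snapshot):
    """Treat a snapshot as usable only if it finished with no failed shards."""
    return snapshot["state"] == "SUCCESS" and snapshot["shards"]["failed"] == 0


if __name__ == "__main__":
    response = {
        "snapshots": [
            {"snapshot": "snapshot-name", "state": "SUCCESS",
             "shards": {"total": 330, "failed": 0, "successful": 330}},
            {"snapshot": "2019-02-01", "state": "IN_PROGRESS",
             "shards": {"total": 0, "failed": 0, "successful": 0}},
        ]
    }
    print(snapshot_states(response))
    print([s["snapshot"] for s in response["snapshots"] if is_restorable(s)])
```

Skipping a scheduled run while any snapshot is still `IN_PROGRESS` is one simple way to avoid starting a second snapshot before the first has finished.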

Answer 1: (score: 0)

Once you have taken a snapshot, you can delete the indices. You can always restore each index using '_restore'.

See the link below for how to snapshot, restore, and delete indices.

http://www.datawrangler.in/2017/12/es-index-s3-snapshot-restoration.html
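As a rough illustration of the '_restore' call, the sketch below builds the request path and JSON body for restoring selected indices from one snapshot. `restore_request` is a hypothetical helper, and the repository, snapshot, and index names are placeholders. It sets `include_global_state` to false on the assumption that only index data should be restored.

```python
def restore_request(repository, snapshot, indices):
    """Return the (path, payload) for a POST that restores the given indices."""
    path = "_snapshot/{}/{}/_restore".format(repository, snapshot)
    payload = {
        # The restore API accepts a comma-separated list of index names.
        "indices": ",".join(indices),
        # Restore index data only, not cluster-wide state.
        "include_global_state": False,
    }
    return path, payload


if __name__ == "__main__":
    path, payload = restore_request(
        "xxxxxxx-es-snapshot-repository", "2019-02-28", ["logs", "awswaf-2019-02-26"]
    )
    print(path)
    print(payload)
```

The path and payload would be sent as a POST to the domain endpoint. An open index with the same name already in the cluster will cause the restore to fail, so such an index has to be deleted, closed, or renamed on restore first.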