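I am using AWS ES (the managed service). AWS takes an automated backup every day. I would like to implement something similar, but more frequent.
To achieve this, I created an S3 bucket, registered it as a repository in the ES cluster, and wrote a scheduler that takes a snapshot of the cluster at specified times.
ES snapshots are incremental in nature, i.e. all existing snapshots are loaded into memory to determine the changes to be saved in the current snapshot.
Over time, the number of snapshots will grow.
I want to retain a specific number of snapshots and delete the rest. For that, we can write another scheduler.
However, until our snapshot-creation scheduler runs again, the snapshots left behind would not be sufficient to restore the entire cluster.
Is there a good way to solve this?
Please suggest.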
Answer 0: (score: 2)
Q1:- How to set up manual index snapshots for Amazon Elasticsearch Service.
https://github.com/miztiik/AWS-Demos/tree/master/How-To/setup-manual-elasticsearch-snapshots
S3-Bucket-Name = xxxxxxx-es-snapshot-repo
ES-IAM-Role = xxxxxxx-es-snapshot-role
ES-REPO-NAME= xxxxxxx-es-snapshot-repository
ES-IAM-USER = xxxxxxx-es-snapshot-user
ES-IAM-Policy = xxxxxxx-es-snapshot-access
ES-POLICY= xxxxxxx-es-allow-role
ES-DOMAIN-NAME = xxxxxxx-waf-logs
ES-END-POINT = https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com
================================================ =========================
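==> Snapshots are backups of a cluster's data and state. State includes cluster settings, node information, index settings, and shard allocation. Elasticsearch snapshots are incremental, meaning that they only store data that has changed since the last successful snapshot. Because of this, the difference in disk usage between frequent and infrequent snapshots is often minimal.
==> Snapshots provide a convenient way to migrate data across Amazon Elasticsearch Service domains and to recover from failure. Automated snapshots are read-only from within a given domain. You cannot use automated snapshots to migrate to a new domain. For migrations, you must use manual snapshots.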
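
On the retention concern in the question: the Elasticsearch snapshot documentation states that deleting a snapshot only removes files that no other snapshot still references, so the snapshots you keep remain fully restorable. The toy model below illustrates that idea. It is a simplified sketch, not Elasticsearch's actual implementation, and the `repo` dictionary and `delete_snapshot` helper are made up for the example.

```python
def delete_snapshot(repo, name):
    """Remove snapshot `name` from the toy repository and return the files freed.

    `repo` maps a snapshot name to the set of segment files it references.
    A file is freed only when no remaining snapshot still references it.
    """
    removed = repo.pop(name)
    still_referenced = set().union(*repo.values()) if repo else set()
    return removed - still_referenced


# Toy repository: each later snapshot reuses unchanged files from earlier ones.
repo = {
    "snap-1": {"seg_a", "seg_b"},
    "snap-2": {"seg_a", "seg_b", "seg_c"},  # only seg_c was newly uploaded
    "snap-3": {"seg_b", "seg_c", "seg_d"},  # seg_a was merged away, seg_d is new
}

freed = delete_snapshot(repo, "snap-1")
print(sorted(freed))           # []
print(sorted(repo["snap-2"]))  # ['seg_a', 'seg_b', 'seg_c']
```

Deleting the oldest snapshot frees nothing here, because every file it referenced is still needed by a newer snapshot, and `snap-2` is still complete afterwards.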
================================================ =========================
Prerequisites
================================================ =========================
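Elasticsearch domain: ES-DOMAIN-NAME = xxxxxxx-waf-logs
Create an S3 bucket - xxxxxxx-es-snapshot-repo
Get the bucket ARN - arn:aws:s3:::xxxxxxx-es-snapshot-repo
IAM role: xxxxxxx-es-snapshot-role
Note: Attach the following permissions to the role, and make sure to change the bucket ARN.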
================================================ =========================
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::xxxxxxx-es-snapshot-repo"
            ]
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::xxxxxxx-es-snapshot-repo/*"
            ]
        }
    ]
}
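
If you need the same permissions for more than one bucket, the document above can be generated instead of hand-edited. This is a minimal sketch using only the Python standard library; the helper name `build_snapshot_bucket_policy` is hypothetical and not part of the original setup.

```python
import json


def build_snapshot_bucket_policy(bucket_name):
    """Return the S3 permissions policy shown above for the given bucket.

    Hypothetical helper: the statements mirror the JSON in this answer,
    only the bucket name is parameterised.
    """
    bucket_arn = "arn:aws:s3:::" + bucket_name
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permission: listing applies to the bucket ARN itself.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [bucket_arn],
            },
            {
                # Object-level permissions: these apply to keys inside the bucket.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_snapshot_bucket_policy("xxxxxxx-es-snapshot-repo")
    print(json.dumps(policy, indent=4))
```

Note that `s3:ListBucket` is granted on the bucket ARN while the object actions are granted on `bucket/*`; mixing the two up is a common reason for the repository registration to fail.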
================================================ =========================
Assign the following trust relationship to the role
================================================ =========================
{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
            "Service": "es.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
    }]
}
================================================ =========================
Add the following policy to the user
================================================ =========================
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::467657035428:role/xxxxxxx-es-snapshot-role"
        },
        {
            "Effect": "Allow",
            "Action": "es:ESHttpPut",
            "Resource": "arn:aws:es:us-east-1:467657035428:domain/xxxxxxx-waf-logs/*"
        }
    ]
}
================================================ =========================
Configure the AWS CLI with the IAM user xxxxxxx-es-snapshot-user
================================================ =========================
Register a manual snapshot repository
================================================ =========================
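==> You must register a snapshot repository with Amazon Elasticsearch Service before you can take manual index snapshots. If your ES domain resides within a VPC, your computer must be connected to the VPC for the registration request to succeed.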
================================================ =========================
Prepare an EC2 client to register our S3 repository
================================================ =========================
Note: Change the host, region, and role ARN in the code below to suit your environment.
================================================ =========================
Install some prerequisite packages
================================================ =========================
yum -y install python-pip
pip install boto3 requests requests-aws4auth
================================================ =========================
Create a Python file to register the repository
================================================ =========================
cat >/tmp/register-repo.py <<"EOF"
import boto3
import requests
from requests_aws4auth import AWS4Auth
host = 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/'
region = 'us-east-1' # For example, us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
# Register repository
path = '_snapshot/xxxxxxx-es-snapshot-repository' # the Elasticsearch API endpoint
url = host + path
payload = {
    "type": "s3",
    "settings": {
        "bucket": "xxxxxxx-es-snapshot-repo",
        "region": "us-east-1",
        "role_arn": "arn:aws:iam::467657035428:role/xxxxxxx-es-snapshot-role"
    }
}
headers = {"Content-Type": "application/json"}
r = requests.put(url, auth=awsauth, json=payload, headers=headers)
print(r.status_code)
print(r.text)
EOF
================================================ =========================
Execute the file to register the repo
================================================ =========================
chmod 700 /tmp/register-repo.py
================================================ =========================
python /tmp/register-repo.py
200
{"acknowledged":true}
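
If the registration runs unattended (for example from the scheduler mentioned in the question), it helps to check the response instead of reading the printed output. A minimal sketch, assuming the success case is exactly what is shown above (HTTP 200 with `"acknowledged": true`); the function name `repository_registered` is hypothetical.

```python
import json


def repository_registered(status_code, body_text):
    """Return True when the response looks like the success case shown above."""
    if status_code != 200:
        return False
    try:
        body = json.loads(body_text)
    except ValueError:
        # The body was not JSON at all, so treat it as a failure.
        return False
    return isinstance(body, dict) and body.get("acknowledged") is True


if __name__ == "__main__":
    print(repository_registered(200, '{"acknowledged":true}'))  # True
```

In register-repo.py this could be called as `repository_registered(r.status_code, r.text)` right after the `requests.put` call.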
================================================ =========================
Take a manual snapshot
================================================ =========================
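When you create a snapshot, you specify two pieces of information:
The name of your snapshot repository - e.g. xxxxxxx-es-snapshot-repository
A name for the snapshot - e.g. 2019-02-01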
================================================ =========================
Note: Snapshots are not instantaneous; they take some time to complete.
================================================ =========================
curl -XPUT 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/_snapshot/xxxxxxx-es-snapshot-repository/2019-02-28'
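
Since the question asks for snapshots more often than once a day, a date-only name like the one above would collide on the second run of the day. The sketch below builds a timestamped name and the matching URL; the helper names and the `%Y-%m-%d-%H-%M` naming scheme are my own assumptions, not something the service requires.

```python
from datetime import datetime, timezone


def snapshot_name(now):
    """Build a lowercase, timestamped snapshot name such as 2019-02-28-13-30."""
    return now.strftime("%Y-%m-%d-%H-%M")


def snapshot_url(host, repository, name):
    """Join the domain endpoint, repository and snapshot name into the PUT target."""
    return host.rstrip("/") + "/_snapshot/" + repository + "/" + name


if __name__ == "__main__":
    url = snapshot_url(
        "https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/",
        "xxxxxxx-es-snapshot-repository",
        snapshot_name(datetime.now(timezone.utc)),
    )
    print(url)
    # To actually take the snapshot, send a PUT to `url`, for example with
    # requests.put(url, auth=awsauth), reusing the awsauth object built in
    # register-repo.py when the domain requires signed requests.
```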
================================================ =========================
Use the following command to verify the state of your domain's snapshots:
================================================ =========================
curl -XGET 'https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com/_snapshot/xxxxxxx-es-snapshot-repository/_all?pretty'
Output:-
{
"snapshots" : [ {
"snapshot" : "snapshot-name",
"uuid" : "FciYMhzFR1iLs0I0Nb1YeA",
"version_id" : 6040299,
"version" : "6.4.2",
"indices" : [ "logs", "awswaf-2019-02-13", "logstash-2019.02.06", "filebeat-6.6.0-2019.02.19", "logstash-2019.02.13", "awswaf-2019-02-21", "awswaf-2019-02-24", "logstash-2019.02.15", "logs-2019-02-15", "filebeat-6.6.0-2019.02.26", "logstash-2019.02.21", "logs-2019-02-13", "awswaf-2019-02-01", "logstash-2019.02.20", "awswaf-2019-02-07", "awswaf-2019-02-26"],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2019-02-26T13:31:59.721Z",
"start_time_in_millis" : 1551187919721,
"end_time" : "2019-02-26T16:24:48.806Z",
"end_time_in_millis" : 1551198288806,
"duration_in_millis" : 10369085,
"failures" : [ ],
"shards" : {
"total" : 330,
"failed" : 0,
"successful" : 330
}
}, {
"snapshot" : "2019-02-01",
"uuid" : "pHwGshbJRGO-C47uCuuFDw",
"version_id" : 6040299,
"version" : "6.4.2",
"indices" : [ "logs", "awswaf-2019-02-13", "logstash-2019.02.06", "filebeat-6.6.0-2019.02.19", "logstash-2019.02.13", "awswaf-2019-02-21", "awswaf-2019-02-24", "logstash-2019.02.15", "logs-2019-02-15", "filebeat-6.6.0-2019.02.26", "logstash-2019.02.21", "logs-2019-02-13", "awswaf-2019-02-01", "filebeat-6.6.0-2019.02.27", "logstash-2019.02.20", "awswaf-2019-02-07", "awswaf-2019-02-26", "awswaf-2019-02-10", "kibana_sample_data_flights"],
"include_global_state" : true,
"state" : "IN_PROGRESS",
"start_time" : "2019-02-27T06:51:30.836Z",
"start_time_in_millis" : 1551250290836,
"end_time" : "1970-01-01T00:00:00.000Z",
"end_time_in_millis" : 0,
"duration_in_millis" : -1551250290836,
"failures" : [ ],
"shards" : {
"total" : 0,
"failed" : 0,
"successful" : 0
}
} ]
}
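
For the retention part of the question, the listing above contains everything needed to decide which snapshots to drop. The following is a rough sketch of that selection step only; `snapshots_to_delete` is a hypothetical helper, and it assumes the response has been parsed into a dict with the fields shown in the output.

```python
def snapshots_to_delete(listing, keep):
    """Return names of completed snapshots beyond the `keep` most recent ones.

    `listing` is the parsed JSON from GET _snapshot/<repository>/_all.
    Snapshots that are not in the SUCCESS state are never selected, so an
    IN_PROGRESS snapshot is left alone.
    """
    if keep < 0:
        raise ValueError("keep must be zero or positive")
    done = [s for s in listing["snapshots"] if s["state"] == "SUCCESS"]
    # Newest first, so everything after the first `keep` entries is surplus.
    done.sort(key=lambda s: s["start_time_in_millis"], reverse=True)
    return [s["snapshot"] for s in done[keep:]]


if __name__ == "__main__":
    listing = {
        "snapshots": [
            {"snapshot": "snapshot-name", "state": "SUCCESS",
             "start_time_in_millis": 1551187919721},
            {"snapshot": "2019-02-01", "state": "IN_PROGRESS",
             "start_time_in_millis": 1551250290836},
        ]
    }
    print(snapshots_to_delete(listing, keep=1))  # []
```

Each returned name can then be removed with a DELETE request to `_snapshot/<repository>/<name>`. Because snapshots share unchanged files, deleting old ones does not make the remaining ones incomplete.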
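Answer 1: (score: 0)
Once you have taken a snapshot, you can delete the indices. You can always restore each index using '_restore'.
Check out the link below on how to snapshot, restore, and delete indices.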
http://www.datawrangler.in/2017/12/es-index-s3-snapshot-restoration.html
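
To make the '_restore' step concrete, here is a small sketch that builds the request for restoring selected indices from one snapshot. The helper name `restore_request` is hypothetical; the `_restore` endpoint and the `indices` and `include_global_state` body fields come from the Elasticsearch snapshot API. The sketch only builds the URL and body, it does not send anything.

```python
import json


def restore_request(host, repository, snapshot, indices):
    """Return (url, body) for restoring the given indices from a snapshot.

    `indices` is a list of index names; the _restore API accepts them as a
    comma-separated string.
    """
    url = host.rstrip("/") + "/_snapshot/" + repository + "/" + snapshot + "/_restore"
    body = {"indices": ",".join(indices), "include_global_state": False}
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = restore_request(
        "https://search-xxxxxxx-waf-logs-efsphsb67nsvddjxxxxxxxxx.us-east-1.es.amazonaws.com",
        "xxxxxxx-es-snapshot-repository",
        "2019-02-01",
        ["logs", "awswaf-2019-02-13"],
    )
    print(url)
    print(body)
```

The request is sent as a POST with a JSON content type. An index with the same name must not be open on the cluster when you restore it, so delete or close it first.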