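Executive summary of the problem: I have a bucket, call it bucket A, in an account we'll call 123, configured with a default customer-managed KMS key (referred to as ID 1111111). The bucket contains two objects under the same path. They have the same KMS key ID and the same owner. When I try to sync them to a new bucket B in another account, call it 456, one syncs successfully but the other fails with: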
An error occurred (AccessDenied) when calling the CopyObject operation: Access Denied
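Has anyone seen this inconsistent behavior before? I call it inconsistent because there is absolutely no difference in access permissions between the two objects, yet one succeeds and the other does not. Note: for simplicity my summary mentions two objects, but my actual case has 30 objects, of which 2 copy and the rest fail, and under some other paths the mix of results is different.
The following describes the setup. Some data has been obfuscated for security, but in a consistent way:
Bucket A (com.mycompany.datalake.us-east-1) bucket policy: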
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowAccess",
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123:root",
"arn:aws:iam::456:root"
]
},
"Action": [
"s3:PutObjectTagging",
"s3:PutObjectAcl",
"s3:PutObject",
"s3:ListBucket",
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
"arn:aws:s3:::com.mycompany.datalake.us-east-1"
]
},
{
"Sid": "DenyIfNotGrantingFullAccess",
"Effect": "Deny",
"Principal": {
"AWS": [
"arn:aws:iam::123:root",
"arn:aws:iam::456:root"
]
},
"Action": "s3:PutObject",
"Resource": [
"arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
"arn:aws:s3:::com.mycompany.datalake.us-east-1"
],
"Condition": {
"StringNotLike": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
},
{
"Sid": "DenyIfNotUsingExpectedKmsKey",
"Effect": "Deny",
"Principal": {
"AWS": [
"arn:aws:iam::123:root",
"arn:aws:iam::456:root"
]
},
"Action": "s3:PutObject",
"Resource": [
"arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
"arn:aws:s3:::com.mycompany.datalake.us-east-1"
],
"Condition": {
"StringNotLike": {
"s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws:kms:us-east-1:123:key/1111111"
}
}
}
]
}
I also created an assumable role in the source account, which I'll call datalake_full_access_role:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
"arn:aws:s3:::com.mycompany.datalake.us-east-1"
]
}
]
}
with a trust relationship to account 456. It is worth mentioning that the policy on KMS key 1111111 is currently wide open:
{
"Version": "2012-10-17",
"Id": "key-default-1",
"Statement": [
{
"Sid": "Enable IAM User Permissions",
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "kms:*",
"Resource": "*"
},
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": [
"kms:Encrypt*",
"kms:Decrypt*",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:Describe*"
],
"Resource": "*"
}
]
}
Now, for the destination bucket B (mycompany-us-west-2-datalake) in account 456, the bucket policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AccountBasedAccess",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::456:root"
},
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::mycompany-us-west-2-datalake",
"arn:aws:s3:::mycompany-us-west-2-datalake/*"
]
}
]
}
To do the migration (sync), I set up an EC2 instance in account 456 and attached an instance profile to it with the following policies:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::123:role/datalake_full_access_role"
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kms:DescribeKey",
"kms:ReEncrypt*",
"kms:CreateGrant",
"kms:Decrypt"
],
"Resource": [
"arn:aws:kms:us-east-1:123:key/1111111"
]
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::com.mycompany.datalake.us-east-1",
"arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*"
]
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::mycompany-us-west-2-datalake",
"arn:aws:s3:::mycompany-us-west-2-datalake/*"
]
}
]
}
On the EC2 instance, I installed the latest AWS CLI:
$ aws --version
aws-cli/1.16.297 Python/3.5.2 Linux/4.4.0-1098-aws botocore/1.13.33
and then ran my sync command:
aws s3 sync s3://com.mycompany.datalake.us-east-1 s3://mycompany-us-west-2-datalake --source-region us-east-1 --region us-west-2 --acl bucket-owner-full-control --exclude '*' --include '*/zone=raw/Event/*' --no-progress
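I believe I have done my homework and all of this should work, yet it works for some objects but not all, and I have nothing left to try. Note that I was 100% successful syncing first to a local directory on the EC2 instance and then from that local directory to the new bucket, using the following two calls: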
aws s3 sync s3://com.mycompany.datalake.us-east-1 datalake --source-region us-east-1 --exclude '*' --include '*/zone=raw/Event/*' --no-progress
aws s3 sync datalake s3://mycompany-us-west-2-datalake --region us-west-2 --acl bucket-owner-full-control --exclude '*' --include '*/zone=raw/Event/*' --no-progress
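This makes absolutely no sense, because there is no difference from an access point of view. Below are the properties of two objects in the source bucket, one that succeeded and one that failed:
Object that succeeded: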
Owner
Dev.Awsmaster
Last modified
Jan 12, 2019 10:11:48 AM GMT-0800
Etag
12ab34
Storage class
Standard
Server-side encryption
AWS-KMS
KMS key ID
arn:aws:kms:us-east-1:123:key/1111111
Size
9.2 MB
Key
security=0/zone=raw/Event/11_96152d009794494efeeae49ed10da653.avro
Object that failed:
Owner
Dev.Awsmaster
Last modified
Jan 12, 2019 10:05:26 AM GMT-0800
Etag
45cd67
Storage class
Standard
Server-side encryption
AWS-KMS
KMS key ID
arn:aws:kms:us-east-1:123:key/1111111
Size
3.2 KB
Key
security=0/zone=raw/Event/05_6913583e47f457e9e25e9ea05cc9c7bb.avro
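Addendum: after looking at several cases, I'm starting to see a pattern. I think there may be a problem when objects are too small. In all 10 of the directories I analyzed, some objects synced successfully but not all; every object that succeeded was 8 MB or larger, and every object that failed was under 8 MB. Could this be a bug in aws s3 sync when KMS is involved? I wonder whether I can tweak ~/.aws/config to work around it?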
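The 8 MB boundary coincides with the AWS CLI's default multipart_threshold of 8 MB, and the error names the CopyObject operation. The following is a minimal sketch of that size split, assuming the CLI issues a single CopyObject call for objects below the threshold and a multipart copy for objects at or above it. The function name copy_path is hypothetical, and the byte sizes are approximations of the console values shown above.

```python
# Assumption: the AWS CLI default multipart_threshold is 8 MB, and bucket-to-bucket
# copies below it use a single CopyObject call while larger ones use a multipart copy.
DEFAULT_MULTIPART_THRESHOLD = 8 * 1024 * 1024


def copy_path(size_bytes, threshold=DEFAULT_MULTIPART_THRESHOLD):
    """Return which copy path an object of this size would be expected to take."""
    return "multipart copy" if size_bytes >= threshold else "CopyObject"


# Approximate sizes taken from the two objects described in the question.
objects = {
    "security=0/zone=raw/Event/11_96152d009794494efeeae49ed10da653.avro": int(9.2 * 1024 * 1024),
    "security=0/zone=raw/Event/05_6913583e47f457e9e25e9ea05cc9c7bb.avro": int(3.2 * 1024),
}

for key, size in objects.items():
    print(f"{key}: {size} bytes -> {copy_path(size)}")
```

Under this assumption, the object that succeeded takes the multipart path and the object that failed takes the CopyObject path, which would be consistent with the AccessDenied error being raised only for the small objects.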
Answer 0 (score: 1)
I found a solution, although I still think this is a bug in aws s3 sync. With the following set in ~/.aws/config, all objects synced successfully:
[default]
output = json
s3 =
    signature_version = s3v4
    multipart_threshold = 1
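I was already using signature_version, but I include it for completeness in case someone has a similar need. The new entry is multipart_threshold = 1, which means an object of any size will trigger a multipart transfer. I did not specify multipart_chunksize, which according to the AWS CLI documentation defaults to 8 MB (5 MB is the minimum part size).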
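Honestly, this requirement makes no sense, because it should not matter whether an object was originally uploaded to S3 with multipart or not. I know it does not matter when KMS is not involved, but apparently it does matter when KMS is involved.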
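One detail worth checking when applying this fix: the settings under s3 = are a nested block, so they need to be indented in ~/.aws/config. The sketch below uses Python's standard configparser to illustrate how the indented lines attach to the s3 key. It only approximates how the CLI reads the file; the helper name read_s3_settings is hypothetical.

```python
import configparser

# The config from the answer, with the nested s3 settings indented.
CONFIG_TEXT = """\
[default]
output = json
s3 =
    signature_version = s3v4
    multipart_threshold = 1
"""


def read_s3_settings(config_text, profile="default"):
    """Parse the indented block under `s3 =` into a dict of strings."""
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    # Indented lines are treated as continuation lines of the `s3` value.
    raw = parser.get(profile, "s3", fallback="")
    settings = {}
    for line in raw.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


s3_settings = read_s3_settings(CONFIG_TEXT)
print(s3_settings)
```

If the two nested lines are not indented, this parser treats them as ordinary top-level keys of the profile and the s3 block comes back empty, so the threshold override would likely not take effect.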