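I'm hosting a static site (plain HTML/CSS) on AWS S3 behind a CloudFront distribution. Configuring CloudFront alone to redirect HTTP to HTTPS works fine. Configuring S3 alone to redirect www to the non-www (naked) domain also works fine.
The problem arises when I try to redirect all HTTP traffic to HTTPS and, at the same time, redirect the www subdomain to non-www.
It simply doesn't work. I haven't been able to find a solution, and I've been looking for months. It may look as though StackOverflow has the answers, but I'm telling you it doesn't. The solutions there either hit a dead end or were written for an older AWS user interface that doesn't quite match the way it works today.
The best I've been able to come up with is an HTML redirect from www to non-www, but that isn't ideal from an SEO or maintainability standpoint.
What is the best solution for this configuration?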
Answer 0 (score: 1)
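As I mentioned in Supporting HTTPS URL redirection with a single CloudFront distribution, the simple and straightforward solution involves two buckets and two CloudFront distributions: one for www and one for the bare domain. I am very skeptical that this would have any negative SEO impact.
However, that answer predates the introduction of the CloudFront Lambda@Edge extension, which offers another solution: it allows you to trigger a JavaScript Lambda function at specific points during CloudFront's request processing, to inspect the request and potentially modify it or otherwise react to it.
There are several examples in the documentation, but they are all very minimalistic, so here is a complete, working example, with more comments than actual code, explaining exactly what it does and how it does it.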
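This function, configured as an Origin Request trigger, fires every time there is a cache miss and inspects the Host header sent by the browser to decide whether the request should be allowed through or should be redirected without actually being sent all the way to S3. For cache hits, the function does not fire, because CloudFront already has the content cached.
Any other domain name associated with the CloudFront distribution will be redirected to the "real" domain name of your site, as configured in the function body. Optionally, it will also return a generated 404 response if someone accesses your distribution's *.cloudfront.net default hostname directly.
You might wonder how the cache of a single CloudFront distribution can differentiate between the content for example.com/some-path and www.example.com/some-path and cache them separately. The answer is that it can and does, if you configure it appropriately for this setup, which means telling it to cache based on selected request headers, specifically the Host header.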
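Ordinarily, enabling that configuration wouldn't be quite compatible with S3, but it works here because the Lambda function also sets the Host header back to the value S3 expects. Note that you need to configure the origin domain name (the bucket's website hosting endpoint) inline, in the code.
With this configuration, you only need one bucket, and the bucket's name does not need to match any of the domain names. You can use any bucket you want, but you do need to use the bucket's website hosting endpoint, so that CloudFront treats it as a custom origin. Creating an "S3 Origin" using the bucket's REST endpoint will not work.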
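The distribution settings described above can be sketched as a plain object. This is only an illustrative fragment in the shape of a CloudFront DistributionConfig using the legacy ForwardedValues style, not a complete or deployable configuration; the origin Id and the choice not to forward query strings are assumptions made for the example.

```javascript
// Illustrative fragment only: a real DistributionConfig requires more fields.
const originDomainName = 'example-bucket.s3-website.us-east-2.amazonaws.com';

const distributionFragment = {
  // both hostnames are served by the one distribution
  Aliases: { Quantity: 2, Items: ['example.com', 'www.example.com'] },
  Origins: {
    Quantity: 1,
    Items: [{
      Id: 'website-endpoint', // hypothetical origin id
      DomainName: originDomainName,
      // a custom origin, because this is the website hosting endpoint
      CustomOriginConfig: {
        HTTPPort: 80,
        HTTPSPort: 443,
        OriginProtocolPolicy: 'http-only', // S3 website endpoints speak HTTP only
      },
    }],
  },
  DefaultCacheBehavior: {
    TargetOriginId: 'website-endpoint',
    // http -> https is handled by CloudFront itself, not by the function
    ViewerProtocolPolicy: 'redirect-to-https',
    ForwardedValues: {
      QueryString: false, // assumption: a static site that ignores query strings
      Cookies: { Forward: 'none' },
      // whitelisting Host makes it part of the cache key and visible to the trigger
      Headers: { Quantity: 1, Items: ['Host'] },
    },
  },
};

console.log(JSON.stringify(distributionFragment, null, 2));
```

The same values can be entered in the console under the origin settings and the cache behavior settings; the object form is just a compact way to see them together.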
    'use strict';

    // if an incoming request is for a domain name other than the canonical
    // (official) hostname for the site, this Lambda@Edge trigger
    // will redirect the request back to the official site, subject to the
    // configuration parameters below.

    // this trigger must be deployed as an Origin Request trigger.

    // in the CloudFront Cache Behavior settings, the Host header must be
    // whitelisted for forwarding, in order for this function to work as intended;
    // this is an artifact of the way the Lambda@Edge interface interacts with the
    // CloudFront cache key mechanism -- we can't react to what we can't see,
    // and if it isn't part of the cache key, CloudFront won't expose it.

    // specify the official hostname of the site; requests to this domain will
    // be passed through; others will redirect to it...
    const canonical_domain_name = 'example.com';

    // ...but note that every CloudFront distribution has a default *.cloudfront.net
    // hostname that can't be disabled; you may not want this hostname to do
    // anything at all, including redirect; set this parameter to true if you
    // want to return 404 for the default hostname; see the render_reject()
    // function to customize the behavior further.
    const reject_default_hostname = false;

    // the "origin" is the server that provides your content; this is configured
    // in the distribution and selected in the Cache Behavior settings, but
    // that information needs to be provided here, so that we can modify
    // successful requests to match what the destination expects.
    const origin_domain_name = 'example-bucket.s3-website.us-east-2.amazonaws.com';

    // http status code for redirects; you may want 302 or 307 for testing,
    // and 301 or 308 for production; note that this is a string, not a number.
    const redirect_http_status_code = '302';

    // for generated redirects, we can also set a cache control header; you'll need
    // to ensure you format this correctly, since the code below does not validate
    // the syntax; here, max-age is how long the browser should cache redirects,
    // while s-maxage tells CloudFront how long to potentially cache them;
    // higher values should result in less traffic and potentially lower costs;
    // set to empty string or null if you don't want to set a value.
    const redirect_cache_control = 'max-age=300, s-maxage=86400';

    // set false to drop the query string on redirects; true to preserve
    const redirect_preserve_querystring = true;

    // set false to change the path to '/' on redirects; true to preserve
    const redirect_preserve_path = true;

    // end of configuration

    // the URL in the generated redirect will always use https unless you
    // configure whitelisting of CloudFront-Forwarded-Proto, in which case we
    // will use that value; if you want to send http to https, use the
    // Viewer Protocol Policy settings in the CloudFront cache behavior.

    exports.handler = (event, context, callback) => {

        // extract the CloudFront object from the trigger event
        const cf = event.Records[0].cf;

        // extract the request object
        const request = cf.request;

        // extract the HTTP Host header
        const host = request.headers.host[0].value;

        // check whether the host header matches the canonical value; if so,
        // set the host header to what the origin expects, and return control
        // to CloudFront
        if (host === canonical_domain_name) {
            request.headers.host[0].value = origin_domain_name;
            return callback(null, request);
        }

        // check for rejection
        if (reject_default_hostname && host.endsWith('.cloudfront.net')) {
            return render_reject(cf, callback);
        }

        // if neither 'return' above has been invoked, then we need to generate a redirect.
        const proto = (request.headers['cloudfront-forwarded-proto'] || [{ value: 'https' }])[0].value;
        const path = redirect_preserve_path ? request.uri : '/';
        const query = redirect_preserve_querystring && (request.querystring !== '') ? ('?' + request.querystring) : '';
        const location = proto + '://' + canonical_domain_name + path + query;

        // build a response object to redirect the browser.
        const response = {
            status: redirect_http_status_code,
            headers: {
                'location': [ { key: 'Location', value: location } ],
            },
            body: '',
        };

        // add the cache control header, if configured
        if (redirect_cache_control) {
            response.headers['cache-control'] = [{ key: 'Cache-Control', value: redirect_cache_control }];
        }

        // return the response object, preventing the request from being sent to
        // the origin server
        return callback(null, response);
    };

    function render_reject(cf, callback) {

        // only invoked if the request is for *.cloudfront.net and you set
        // reject_default_hostname to true; here, we generate a very simple
        // response, text/plain, with a 404 error. This can be customized to HTML
        // or XML, etc., according to your local practices, but be sure you properly
        // escape the request URI, since it is untrusted data and could lead to an
        // XSS injection otherwise; no similar vulnerability exists with plain text.
        const body_text = `The requested URL '${cf.request.uri}' does not exist ` +
                          'on this server, or access is not enabled via the ' +
                          `${ cf.request.headers.host[0].value } endpoint.\r\n`;

        // generate a response; you may want to customize this; note that
        // Lambda@Edge is strict with regard to the way headers are specified;
        // the outer keys are lowercase, the inner keys can be mixed.
        const response = {
            status: '404',
            headers: {
                'cache-control': [{ key: 'Cache-Control', value: 'no-cache, s-maxage=86400' }],
                'content-type': [{ key: 'Content-Type', value: 'text/plain' }],
            },
            body: body_text,
        };

        return callback(null, response);
    }

    // eof
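Before deploying, it can help to exercise the trigger locally with a hand-built event. The following is a minimal sketch, not part of the original function: the helper names makeOriginRequestEvent and invoke are hypothetical, the event contains only the fields the trigger reads (real Origin Request events carry more), and the file name index.js is an assumption.

```javascript
// Build a stripped-down Origin Request event with only the fields the
// trigger reads; real events contain additional data.
function makeOriginRequestEvent(host, uri, querystring) {
  return {
    Records: [{
      cf: {
        request: {
          method: 'GET',
          uri: uri,
          querystring: querystring || '',
          headers: {
            // Lambda@Edge headers: lowercase outer key, array of { key, value }
            host: [{ key: 'Host', value: host }],
          },
        },
      },
    }],
  };
}

// Wrap a callback-style Lambda handler in a Promise for convenience.
function invoke(handler, event) {
  return new Promise((resolve, reject) => {
    handler(event, {}, (err, result) => (err ? reject(err) : resolve(result)));
  });
}

// Usage, assuming the trigger is saved as index.js (hypothetical file name):
//   const { handler } = require('./index.js');
//   invoke(handler, makeOriginRequestEvent('www.example.com', '/about.html', 'a=1'))
//     .then((result) => console.log(JSON.stringify(result, null, 2)));
```

With the configuration values shown in the listing, a request for www.example.com should come back as a 302 response whose Location is https://example.com/about.html?a=1, while a request for example.com should come back as the request object with its Host header rewritten to the origin domain name.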
Answer 1 (score: 1)
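After completing the Lambda@Edge other answer here, I realized that there is a much simpler solution, using only a single CloudFront distribution and three (explained below) S3 buckets.
There are more constraints with this solution, but it has fewer moving parts and costs less to implement and use.
Here are the constraints: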
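- The bucket names must match the domain names: a bucket named example.com and a bucket named www.example.com.
- You need an additional bucket named after the hostname CloudFront assigned to your distribution, e.g. dzczcexample.cloudfront.net, and this bucket must also be in the same region as the other two buckets.
Configure your CloudFront distribution's Origin Domain Name to point to your main content bucket using its website hosting endpoint, e.g. example.com.s3-website.us-east-2.amazonaws.com.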
Configure the Alternate Domain Name settings for both example.com and www.example.com.
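Whitelist the Host header for forwarding to the origin. This setting takes advantage of the fact that when S3 doesn't recognize the incoming HTTP Host header as one belonging to S3, then...
"the bucket for the request is the lowercase value of the Host header, and the key for the request is the Request-URI."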
https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html
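Um... perfect! That's exactly what we need. It gives us a way to pass requests to multiple buckets in a single S3 region through a single CloudFront distribution, based on what the browser asks for, because with this setup we're able to split the logic: the origin domain name configured in CloudFront determines which S3 region receives the request, while the Host header is used to select which bucket handles the request. (This is why all the buckets must be in the same region, as mentioned above. Otherwise, the request will be delivered to the "main" bucket's region, and that region will reject the request as misrouted if the bucket it identifies is in a different region.)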
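With this configuration in place, you'll find that requests for example.com are handled by the example.com bucket and requests for www.example.com are handled by the www.example.com bucket, which means all you need to do now is configure the buckets as desired.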
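The routing rule quoted above can be modeled in a few lines. This is only a simplified sketch for intuition: the function name bucketForHost is hypothetical, and treating every hostname under amazonaws.com as "belonging to S3" is an assumption, since the real recognition logic is internal to S3.

```javascript
// Simplified model: which bucket would S3 select for a given Host header?
function bucketForHost(hostHeader) {
  const host = hostHeader.trim().toLowerCase();
  // Assumption for illustration: anything under amazonaws.com is treated as
  // one of S3's own hostnames, so the bucket is not taken from the Host header.
  if (host === 'amazonaws.com' || host.endsWith('.amazonaws.com')) {
    return null;
  }
  // Otherwise the bucket is the lowercase value of the Host header.
  return host;
}

// The three hostnames a browser could use to reach the distribution.
for (const h of ['example.com', 'WWW.Example.com', 'd111jozxyqk.cloudfront.net']) {
  console.log(h, '->', bucketForHost(h));
}
```

Note that the distribution's default hostname maps to a bucket name as well, whether or not you have created that bucket.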
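But there's one more critical step. You absolutely need to create a bucket named after your CloudFront distribution's assigned default domain name (e.g. d111jozxyqk.cloudfront.net) in order to avoid setting up an exploitable scenario. This isn't a security vulnerability; it's a billing vulnerability. How you configure this bucket doesn't matter much, but it is important that you own the bucket so that nobody else can create it. Why? Because with this configuration, requests sent directly to your CloudFront distribution's default domain name (rather than to your custom domains) will cause S3 to return a No Such Bucket error for that bucket name. If someone were to discover your setup, they could create that bucket, and you would pay for all of their data traffic through your CloudFront distribution. Create the bucket, and either leave it empty (so that an error is returned) or set it up to redirect to your main website.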
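For the buckets that should only redirect, the website configuration can be expressed as parameters in the shape accepted by S3's PutBucketWebsite operation. This is a sketch under assumptions: the helper name redirectAllParams is hypothetical, it only builds the parameter objects and does not call AWS, and redirecting the default-hostname bucket (rather than leaving it empty) is just one of the two options described above.

```javascript
// Build PutBucketWebsite-style parameters for a bucket that redirects every
// request to the canonical hostname over https.
function redirectAllParams(bucketName, canonicalHost) {
  return {
    Bucket: bucketName,
    WebsiteConfiguration: {
      RedirectAllRequestsTo: { HostName: canonicalHost, Protocol: 'https' },
    },
  };
}

// example default hostname taken from the text; yours will differ
const cloudfrontHostname = 'd111jozxyqk.cloudfront.net';

// the www bucket and the default-hostname bucket both redirect to example.com
const redirectBuckets = ['www.example.com', cloudfrontHostname]
  .map((name) => redirectAllParams(name, 'example.com'));

console.log(JSON.stringify(redirectBuckets, null, 2));
```

Each object could then be passed to whichever S3 client or tool you use to apply the bucket's website configuration; the main content bucket, example.com, keeps its normal website hosting settings.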