在设置If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET
的情况下直接向后端发送GET请求时,Apache正确返回304而没有内容。
当我通过Varnish 3.0.2发送相同的请求时,它会响应200并重新发送所有内容,即使客户已经拥有它。显然,这不是一个很好的带宽使用。我的理解是Varnish支持智能处理这个头,应该发送一个304,所以我想我的.vcl文件做错了。
Varnishlog提供了这个:
16 SessionOpen c 84.97.17.233 64416 :80
16 ReqStart c 84.97.17.233 64416 1597323690
16 RxRequest c GET
16 RxURL c /fr/CS/CS_AU-Maboreke-6-6-2004.pdf
16 RxProtocol c HTTP/1.0
16 RxHeader c Host: www.quotaproject.org
16 RxHeader c User-Agent: Sprawk/1.3 (http://www.sprawk.com/)
16 RxHeader c Accept: */*
16 RxHeader c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
16 RxHeader c Connection: close
16 RxHeader c If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET
16 VCL_call c recv lookup
16 VCL_call c hash
16 Hash c /fr/CS/CS_AU-Maboreke-6-6-2004.pdf
16 Hash c www.quotaproject.org
16 VCL_return c hash
16 Hit c 1597322756
16 VCL_call c hit
16 VCL_acl c NO_MATCH CTRLF5
16 VCL_return c deliver
16 VCL_call c deliver deliver
16 TxProtocol c HTTP/1.1
16 TxStatus c 200
16 TxResponse c OK
16 TxHeader c Server: Apache
16 TxHeader c Last-Modified: Wed, 09 Jun 2004 16:07:50 GMT
16 TxHeader c Vary: Accept-Encoding
16 TxHeader c Content-Type: application/pdf
16 TxHeader c Date: Wed, 22 Feb 2012 18:25:05 GMT
16 TxHeader c Age: 12432
16 TxHeader c Connection: close
16 Gzip c U D - 107685 115763 80 796748 861415
16 Length c 98304
16 ReqEnd c 1597323690 1329935105.713264704 1329935106.208528996 0.000071526 0.000068426 0.495195866
16 SessionClose c EOF mode
16 StatSess c 84.97.17.233 64416 0 1 1 0 0 0 203 98304
如果我理解正确,该对象已经在Varnish的缓存中,因此它不需要联系后端,但它已经知道Last-Modified
那么为什么它不响应304?
这是我的VCL文件:
backend idea {
# .host = "www.idea.int";
.host = "83.145.60.235"; # IDEA's public website IP
.port = "80";
}
backend qp {
# .host = "www.quotaproject.org";
.host = "83.145.60.235"; # IDEA's public website IP
.port = "80";
}
#
#Below is a commented-out copy of the default VCL logic. If you
#redefine any of these subroutines, the built-in logic will be
#appended to your code.
#
sub vcl_recv {
# force domain so that Apache handles the VH correctly
if (req.http.host ~ "^qp" || req.http.host ~ "quotaproject.org$") {
set req.http.Host = "www.quotaproject.org";
set req.backend = qp;
} else {
# default to idea.int
set req.http.Host = "www.idea.int";
set req.backend = idea;
}
# Before anything else we need to fix gzip compression
if (req.http.Accept-Encoding) {
if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
# No point in compressing these
remove req.http.Accept-Encoding;
} else if (req.http.Accept-Encoding ~ "gzip") {
set req.http.Accept-Encoding = "gzip";
} else if (req.http.Accept-Encoding ~ "deflate") {
set req.http.Accept-Encoding = "deflate";
} else {
# unknown algorithm
remove req.http.Accept-Encoding;
}
}
# ajax requests bypass cache. TODO: Make sure you Javascript implementation for AJAX actually sets XMLHttpRequest
if (req.http.X-Requested-With == "XMLHttpRequest") {
return(pass);
}
if (req.request != "GET" &&
req.request != "HEAD" &&
req.request != "PUT" &&
req.request != "POST" &&
req.request != "TRACE" &&
req.request != "OPTIONS" &&
req.request != "DELETE") {
/* Non-RFC2616 or CONNECT which is weird. */
return (pipe);
}
# Purge everything url - this isn't the squid way, but works
if (req.url ~ "^/varnishpurge") {
if (!client.ip ~ purge) {
error 405 "Not allowed.";
}
if (req.url == "/varnishpurge") {
ban("req.http.host == " + req.http.host + " && req.url ~ ^/");
error 841 "Purged site.";
}
else {
ban("req.http.host == " + req.http.host + " && req.url ~ ^" + regsub( req.url, "^/varnishpurge(.*)$", "\1" ) + "$");
error 842 "Purged page.";
}
}
# spoof the client IP (taken from http://utvbloggen.se/snabb-guide-till-varnish/)
remove req.http.X-Forwarded-For;
set req.http.X-Forwarded-For = client.ip;
# Force delivery from cache even if other things indicate otherwise
if (req.url ~ "\.(flv)") {
# pipe flash start away
return(pipe);
}
if (req.url ~ "\.(jpg|jpeg|gif|png|tiff|tif|svg|swf|ico|css|vsd|doc|ppt|pps|xls|pdf|mp3|mp4|m4a|ogg|mov|avi|wmv|sxw|zip|gz|bz2|tgz|tar|rar|odc|odb|odf|odg|odi|odp|ods|odt|sxc|sxd|sxi|sxw|dmg|torrent|deb|msi|iso|rpm)$") {
# cookies are irrelevant here
unset req.http.Cookie;
unset req.http.Authorization;
}
# Force short-circuit to the real site for these dynamic pages
if (req.url ~ "/customcf/" || req.url ~ "/uid/editData.cfm" || req.url ~ "^/private/") {
return(pass);
}
# Remove user agent, since Apache will server these resources the same way
if (req.http.User-Agent) {
set req.http.User-Agent = "";
}
if (req.http.Cookie) {
# removes all cookies named __utm? (utma, utmb...) - tracking thing
set req.http.Cookie = regsuball(req.http.Cookie, "(^|; ) *__utm.=[^;]+;? *", "\1");
# remove cStates for RHM boxes (the server doesn't need to know these, JS will handle this client-side)
set req.http.cookie = regsub(req.http.cookie, "(; )?cStates=[^;]*", ""); #cStates might sometimes have a blank value
# remove ColdFusion session cookie stuff
if (!req.url ~ "^/publications/" && !req.url ~ "^/uid/admin/") {
set req.http.cookie = regsub(req.http.cookie, "(; )?CFID=[^;]+", "");
set req.http.cookie = regsub(req.http.cookie, "(; )?CFTOKEN=[^;]+", "");
}
# Remove the cookie header if it's empty after cleanup
if (req.http.cookie ~ "^;? *$") {
# The only cookie data left is a semicolon or spaces
remove req.http.cookie;
}
}
}
#
# Called when the requested object was not found in the cache
#
sub vcl_hit {
# Allow administrators to easily flush the cache from their browser
if (client.ip ~ CTRLF5) {
if (req.http.pragma ~ "no-cache" || req.http.Cache-Control ~ "no-cache") {
set obj.ttl = 0s;
return(pass);
}
}
}
#
# Called when the requested object has been retrieved from the
# backend, or the request to the backend has failed
#
sub vcl_fetch {
set beresp.grace = 1h;
# strip the cookie before the image is inserted into cache.
if (req.url ~ "\.(jpg|jpeg|gif|png|tiff|tif|svg|swf|ico|css|vsd|doc|ppt|pps|xls|pdf|mp3|mp4|m4a|ogg|mov|avi|wmv|sxw|zip|gz|bz2|tgz|tar|rar|odc|odb|odf|odg|odi|odp|ods|odt|sxc|sxd|sxi|sxw|dmg|torrent|deb|msi|iso|rpm)$") {
remove beresp.http.set-cookie;
set beresp.ttl = 100w;
}
# Remove CF session cookies for everything but the publications subsite
if (!req.url ~ "^/publications/" && !req.url ~ "/customcf/" && !req.url ~ "^/uid/admin/" && !req.url ~ "^/uid/editData.cfm") {
remove beresp.http.set-cookie;
}
if (beresp.ttl < 48h) {
set beresp.ttl = 48h;
}
}
#
# Called before a cached object is delivered to the client
#
sub vcl_deliver {
# We'll be hiding some headers added by Varnish. We want to make sure people are not seeing we're using Varnish.
remove resp.http.X-Varnish;
remove resp.http.Via;
# We'd like to hide the X-Powered-By headers. Nobody has to know we can run PHP and have version xyz of it.
remove resp.http.X-Powered-By;
}
任何人都可以看到问题吗?
更新:根据http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3
Note: When handling an If-Modified-Since header field, some
servers will use an exact date comparison function, rather than a
less-than function, for deciding whether to send a 304 (Not
Modified) response.
这似乎可能是Varnish的行为。我发送的是另一个日期,该日期在真实文件的最后修改日期之前,但不完全是在Varnish中缓存的内容。
答案 0 :(得分:7)
问题是If-Modified-Since请求标头中的非GMT时区:
If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET
根据http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3
所有HTTP日期/时间戳必须以格林威治标准时间(GMT)表示,无一例外。
Varnish将此作为严格要求实施,而Apache则更加健壮地处理非标准日期格式。这就是您在直接查询Apache时观察到不同行为的原因。
答案 1 :(得分:2)
由于这个问题仍然没有答案和几张投票,我会发一个答案。
这似乎不是Varnish 3.0.0(我们正在使用)或您在网站上运行的当前版本的Varnish的问题。
请求具有过期If-Modified-Since标头的内容时,200 OK响应:
# curl -z "Wed, 09 Jun 2010 16:07:50 GMT" --head "www.quotaproject.org/robots.txt"
HTTP/1.1 200 OK
Server: Apache
Last-Modified: Tue, 22 Jan 2013 13:23:41 GMT
Vary: Accept-Encoding
Cache-Control: public
Content-Type: text/plain; charset=UTF-8
Date: Mon, 25 Nov 2013 15:00:45 GMT
Age: 69236
Connection: keep-alive
X-Cache: HIT
# curl -z "Wed, 09 Jun 2013 16:07:50 GMT" --head "www.quotaproject.org/robots.txt"
HTTP/1.1 304 Not Modified
Server: Apache
Last-Modified: Tue, 22 Jan 2013 13:23:41 GMT
Vary: Accept-Encoding
Cache-Control: public
Content-Type: text/plain; charset=UTF-8
Date: Mon, 25 Nov 2013 15:00:52 GMT
Age: 69243
Connection: keep-alive
X-Cache: HIT
与您在varnishlog输出中提供的示例相同:
# curl -z "Wed, 15 Feb 2012 07:25:00 CET" --head "www.quotaproject.org/fr/CS/CS_AU-Maboreke-6-6-2004.pdf"
HTTP/1.1 304 Not Modified
Server: Apache
Last-Modified: Wed, 09 Jun 2004 16:07:50 GMT
Cache-Control: public
Content-Type: application/pdf
Accept-Ranges: bytes
Date: Mon, 25 Nov 2013 15:08:48 GMT
Age: 335802
Connection: keep-alive
X-Cache: HIT
我会说Varnish按预期工作。也许这是您使用的Varnish构建的问题,或者测试方法存在一些问题。我也看不到你的VCL有任何问题。