Why does httputil.NewSingleHostReverseProxy cause an error on some www sites?

Time: 2015-07-30 05:07:14

Tags: http go reverse-proxy

In the following example:

package main

import (
    "fmt"
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    p := new(Proxy)
    //host := "www.google.com" // WORKS AS EXPECTED
    host := "www.apple.com" // GIVES AN ERROR
    u, err := url.Parse(fmt.Sprintf("http://%v/", host))
    if err != nil {
        // Stop here: continuing with a nil URL would panic later.
        log.Fatalf("Error parsing URL: %v", err)
    }
    p.proxy = httputil.NewSingleHostReverseProxy(u)
    http.Handle("/", p)
    log.Fatal(http.ListenAndServe("localhost:8000", nil))
}

type Proxy struct {
    proxy *httputil.ReverseProxy
}

func (p *Proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    p.proxy.ServeHTTP(w, r)
}

Swapping 'www.google.com' for 'www.apple.com' and pointing Chrome at 'localhost:8000' produces this error:

Invalid URL

The requested URL "/", is invalid.
Reference #9.a61a32b8.1438231668.41733295

Digging further, for www.apple.com I get:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 400 Bad Request
< Content-Length: 194
< Content-Type: text/html
< Date: Thu, 30 Jul 2015 05:20:38 GMT
< Expires: Thu, 30 Jul 2015 05:20:38 GMT
< Mime-Version: 1.0
* Server AkamaiGHost is not blacklisted
< Server: AkamaiGHost
< 
<HTML><HEAD>
<TITLE>Invalid URL</TITLE>
</HEAD><BODY>
<H1>Invalid URL</H1>
The requested URL "&#47;", is invalid.<p>
Reference&#32;&#35;9&#46;65b454b8&#46;1438233638&#46;1f1b8a40
</BODY></HTML>
* Connection #0 to host localhost left intact

And for www.google.com:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 302 Found
< Alternate-Protocol: 80:quic,p=0
< Cache-Control: private
< Content-Length: 219
< Content-Type: text/html; charset=UTF-8
< Date: Thu, 30 Jul 2015 05:03:16 GMT
< Location: http://www.google.com/
* Server sffe is not blacklisted
< Server: sffe
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< 
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
* Connection #0 to host localhost left intact

Now if I use 'apple.com' instead of 'www.apple.com', things work fine:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 301 Moved Permanently
< Content-Type: text/html
< Date: 
< Location: http://www.apple.com/
< Referer: 
* Server  is not blacklisted
< Server: 
< Content-Length: 0
< 
* Connection #0 to host localhost left intact

What is going on?

1 Answer:

Answer 0 (score: 5)

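The problem here is virtual servers (name-based virtual hosting): some of the sites you connect to never learn which domain you are asking for, because the Host HTTP header field is still set to localhost:8000 rather than www.apple.com. To fix this, the reverse proxy has to rewrite the Host header.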

Unfortunately, httputil.NewSingleHostReverseProxy does not provide a simple way to rewrite it, so most of what I added below is copied from the net/http/httputil source code:

package main

import (
    "fmt"
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "strings"
)

func main() {
    p := new(Proxy)
    host := "www.apple.com"
    u, err := url.Parse(fmt.Sprintf("http://%v/", host))
    if err != nil {
        // Stop here: continuing with a nil URL would panic later.
        log.Fatalf("Error parsing URL: %v", err)
    }

    targetQuery := u.RawQuery
    p.proxy = &httputil.ReverseProxy{
        Director: func(req *http.Request) {
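            // Send the target's name in the Host header so the upstream
            // virtual host matches; the remaining lines mirror the default Director.
            req.Host = host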
            req.URL.Scheme = u.Scheme
            req.URL.Host = u.Host
            req.URL.Path = singleJoiningSlash(u.Path, req.URL.Path)
            if targetQuery == "" || req.URL.RawQuery == "" {
                req.URL.RawQuery = targetQuery + req.URL.RawQuery
            } else {
                req.URL.RawQuery = targetQuery + "&" + req.URL.RawQuery
            }
        },
    }

    http.Handle("/", p)
    log.Fatal(http.ListenAndServe("localhost:8000", nil))
}

func singleJoiningSlash(a, b string) string {
    aslash := strings.HasSuffix(a, "/")
    bslash := strings.HasPrefix(b, "/")
    switch {
    case aslash && bslash:
        return a + b[1:]
    case !aslash && !bslash:
        return a + "/" + b
    }
    return a + b
}

type Proxy struct {
    proxy *httputil.ReverseProxy
}

func (p *Proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    p.proxy.ServeHTTP(w, r)
}