Different HTTP responses when a request is sent directly from the browser versus relayed by a Java proxy

Date: 2012-07-27 04:51:28

Tags: java http response

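I am writing a relay proxy service in Java that sits between the browser and the internet. Its only purpose is to observe the web requests sent by the browser and the responses returned to it, so that the responses can be parsed offline later.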

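My Java proxy listens on a specific socket for connections from the browser. When a new connection arrives, it reads the browser's request headers, identifies the host to connect to, opens a connection to that host, and forwards the browser's request. The code that parses the browser request and relays the server response is the streamHTTPData() method given below. In the code, debugOut is the standard System.out.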
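The "identify the host to connect to" step can be sketched roughly as follows. This is a minimal illustration, not the code used in the proxy above: the class name HostHeader and the method parse are hypothetical, and the default port of 80 assumes plain HTTP. It takes the value that streamHTTPData() stores in its host buffer and splits off an optional port.

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not part of the original proxy.
final class HostHeader {
    // Splits a Host header value such as "example.com:8080" into host and port.
    // Assumes plain HTTP, so the port defaults to 80 when none is given.
    // A non-numeric port makes Integer.parseInt throw NumberFormatException.
    static InetSocketAddress parse(String hostHeaderValue) {
        String value = hostHeaderValue.trim();
        String host = value;
        int port = 80;
        int colon = value.lastIndexOf(':');
        // Treat the colon as a port separator only when it comes after any
        // closing bracket of an IPv6 literal such as "[::1]".
        if (colon > 0 && value.indexOf(']') < colon) {
            host = value.substring(0, colon);
            String portText = value.substring(colon + 1);
            if (!portText.isEmpty()) {
                port = Integer.parseInt(portText);
            }
        }
        // createUnresolved avoids a DNS lookup at parse time.
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

The returned address could then be used to open the outbound connection, for example with new Socket(addr.getHostString(), addr.getPort()).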

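The code works for most websites, but on some sites a strange problem occurs and I cannot view the home page. I noticed this while following random Google search result links and landing on a forum. Using the HTTPFox extension in Firefox, I noticed that the request the browser sends to the Java program, and that the program forwards to the web server, is exactly the same. However, I receive an HTTP 200 response when the Java middleware is not used and an HTTP 404 otherwise. I am not sure what the problem is. Can anyone point me in the right direction? The HTTP requests and responses captured by HTTPFox are shown below.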
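One thing that may be worth checking in a relay like this is the form of the request target. A browser configured to use an HTTP proxy normally sends the target in absolute form (GET http://host/path HTTP/1.1), while an origin server generally expects origin form (GET /path HTTP/1.1). Whether this explains the 404 here is not established by the captures; it is only a candidate. The sketch below shows one way to normalise the request line before forwarding it. The class name RequestLineRewriter and the method toOriginForm are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the original proxy.
final class RequestLineRewriter {
    // Rewrites "GET http://host/path?q HTTP/1.1" to "GET /path?q HTTP/1.1".
    // Lines that are not in absolute form over plain http are returned unchanged.
    static String toOriginForm(String requestLine) {
        String[] parts = requestLine.split(" ");
        if (parts.length != 3) {
            return requestLine; // not "METHOD target VERSION"
        }
        String target = parts[1];
        if (!target.regionMatches(true, 0, "http://", 0, 7)) {
            return requestLine; // already origin form, or e.g. a CONNECT target
        }
        try {
            URI uri = new URI(target);
            String path = uri.getRawPath();
            if (path == null || path.isEmpty()) {
                path = "/"; // "http://host" with no path means the root
            }
            if (uri.getRawQuery() != null) {
                path += "?" + uri.getRawQuery();
            }
            return parts[0] + " " + path + " " + parts[2];
        } catch (URISyntaxException e) {
            return requestLine; // leave anything unparseable untouched
        }
    }
}
```

Logging the exact first line written to the server socket, before and after such a rewrite, would show whether the forwarded request really matches what the browser sends when it connects directly.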

private int streamHTTPData(InputStream in, OutputStream out, StringBuffer host, StringBuffer url, boolean waitForDisconnect) {
    // get the HTTP data from an InputStream, and send it to
    // the designated OutputStream
    StringBuffer header = new StringBuffer("");
    String data = "";
    int responseCode = 200;
    int contentLength = 0;
    int pos = -1;
    int byteCount = 0;

    try {
        // get the first line of the header, so we know the response code
        data = readLine(in);
        if (data != null) {
            header.append(data + "\r\n");
            pos = data.indexOf(" ");
            if ((data.toLowerCase().startsWith("http")) && (pos >= 0)
                    && (data.indexOf(" ", pos + 1) >= 0)) {
                String rcString = data.substring(pos + 1,
                        data.indexOf(" ", pos + 1));
                try {
                    responseCode = Integer.parseInt(rcString);
                } catch (Exception e) {
                    if (debugLevel > 0)
                        debugOut.println("Error parsing response code "
                                + rcString);
                }
            } else {
                if ((pos >= 0) && (data.indexOf(" ", pos + 1) >= 0)) {
                    String suffix = data.substring(pos + 1,
                            data.indexOf(" ", pos + 1));
                    url.setLength(0);
                    url.append(suffix.trim());
                }
            }
        }

        // get the rest of the header info
        while ((data = readLine(in)) != null) {
            // the header ends at the first blank line
            if (data.length() == 0)
                break;
            header.append(data + "\r\n");

            // check for the Host header
            pos = data.toLowerCase().indexOf("host:");
            if (pos >= 0) {
                host.setLength(0);
                host.append(data.substring(pos + 5).trim());
            }

            // check for the Content-Length header
            pos = data.toLowerCase().indexOf("content-length:");
            if (pos >= 0)
                contentLength = Integer.parseInt(data.substring(pos + 15)
                        .trim());
        }

        // add a blank line to terminate the header info
        header.append("\r\n");

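        // convert the header to a byte array, and write it to our stream;
        // use the byte array's own length, since header.length() counts chars
        byte[] headerBytes = header.toString().getBytes("ISO-8859-1");
        out.write(headerBytes, 0, headerBytes.length);
        System.out.println(header.toString());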
        // if the header indicated that this was not a 200 response,
        // just return what we've got if there is no Content-Length,
        // because we may not be getting anything else
        if ((responseCode != 200) && (contentLength == 0)) {
            out.flush();
            return header.length();
        }

        // get the body, if any; we try to use the Content-Length header to
        // determine how much data we're supposed to be getting, because
        // sometimes the client/server won't disconnect after sending us
        // information...
        if (contentLength > 0)
            waitForDisconnect = false;

        if ((contentLength > 0) || (waitForDisconnect)) {
            try {
                byte[] buf = new byte[4096];
                int bytesIn = 0;
                while (((byteCount < contentLength) || (waitForDisconnect))
                        && ((bytesIn = in.read(buf)) >= 0)) {
                    out.write(buf, 0, bytesIn);
                    out.flush();
                    byteCount += bytesIn;
                }
            } catch (Exception e) {
                String errMsg = "Error getting HTTP body: " + e;
                if (debugLevel > 0)
                    debugOut.println(errMsg);
            }
        }
    } catch (Exception e) {
        if (debugLevel > 0)
            debugOut.println("Error getting HTTP data: " + e);
    }

    // flush the OutputStream and return
    try {
        out.flush();
    } catch (Exception e) {
    }
    return (header.length() + byteCount);
}
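The method above sizes the body from Content-Length alone. Both responses captured below carry Transfer-Encoding: chunked and no Content-Length, so a 200 response is relayed only through the waitForDisconnect path, and a non-200 response returns after the headers without relaying any body. A minimal sketch of relaying a chunked body follows, assuming the headers have already been consumed from the stream. The class ChunkedRelay and its methods are hypothetical and not part of the original proxy.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original proxy.
final class ChunkedRelay {
    // Copies one line (up to and including LF) from in to out and returns
    // its text without the CR/LF terminator.
    private static String relayLine(InputStream in, OutputStream out) throws IOException {
        StringBuilder sb = new StringBuilder();
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a line");
            }
            out.write(b);
            if (b == '\n') {
                return sb.toString();
            }
            if (b != '\r') {
                sb.append((char) b);
            }
        }
    }

    // Relays a chunked body byte-for-byte and returns the number of payload
    // bytes seen (chunk data only, excluding size lines and trailers).
    static int relayChunkedBody(InputStream in, OutputStream out) throws IOException {
        int total = 0;
        byte[] buf = new byte[4096];
        while (true) {
            String sizeLine = relayLine(in, out);
            int semi = sizeLine.indexOf(';'); // drop any chunk extensions
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int remaining = Integer.parseInt(hex, 16);
            if (remaining == 0) {
                break; // last chunk
            }
            while (remaining > 0) {
                int n = in.read(buf, 0, Math.min(buf.length, remaining));
                if (n < 0) {
                    throw new EOFException("stream ended inside a chunk");
                }
                out.write(buf, 0, n);
                remaining -= n;
                total += n;
            }
            relayLine(in, out); // the CRLF that follows the chunk data
        }
        // Trailer section: zero or more header lines, ended by an empty line.
        while (relayLine(in, out).length() > 0) {
            // each trailer line has already been copied by relayLine
        }
        out.flush();
        return total;
    }
}
```

Because the sketch stops at the end of the chunked message rather than waiting for the server to close the connection, it leaves any following bytes on a keep-alive connection unread.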

HTTP request (with and without the middleware):

(Request-Line)  GET / HTTP/1.1
Host    andhrawatch.com
User-Agent  Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0.1
Accept  text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-us,en;q=0.5
Accept-Encoding gzip, deflate
Proxy-Connection    keep-alive

HTTP response without the Java middleware:

(Status-Line)   HTTP/1.1 200 OK
Date    Fri, 27 Jul 2012 03:51:38 GMT
Server  Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8e-fips-rhel5   mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By    PHP/5.3.1
P3P CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM"
Expires Mon, 1 Jan 2001 00:00:00 GMT
Cache-Control   post-check=0, pre-check=0
Pragma  no-cache
Set-Cookie  0f486952816b6d6bf53a4c34b724b278=c68edaebc6dedb2b291832dfbfb784fc; path=/
Last-Modified   Fri, 27 Jul 2012 03:51:38 GMT
Keep-Alive  timeout=5, max=100
Connection  Keep-Alive
Transfer-Encoding   chunked
Content-Type    text/html; charset=utf-8   

HTTP response with the Java middleware:

(Status-Line)   HTTP/1.1 404 Component not found
Date    Fri, 27 Jul 2012 03:54:39 GMT
Server  Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8e-fips-rhel5            mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By    PHP/5.3.1
P3P CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM"
Expires Mon, 1 Jan 2001 00:00:00 GMT
Cache-Control   post-check=0, pre-check=0
Pragma  no-cache
Set-Cookie  0f486952816b6d6bf53a4c34b724b278=33806d89181aa6d488ccba1b9163e511; path=/
Last-Modified   Fri, 27 Jul 2012 03:54:39 GMT
Transfer-Encoding   chunked
Content-Type    text/html; charset=utf-8

0 answers:

No answers