Proper HTTP GET request

Time: 2013-12-12 05:52:52

Tags: java html http get httprequest

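I'm having some trouble understanding the concept of an HTTP GET request, beyond knowing that it asks a server for a web page. Today I wrote a class that tries to use an HTTP GET request to fetch the HTML of a web page. Let me include the class and explain my confusion: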

import java.io.*;
import java.net.*;

public class HTMLFetcher 
{
    private static final int PORT = 80;
    private URL url;


    public HTMLFetcher(String url) throws Exception // url = http://www.-----.com/birds.html
    {
        this.url = new URL(url);
        fetch(this.url.getHost());
    }

    private  String createRequest(URL url) { // Is there a problem with this request? 
        String request = "GET" + "/index.html" + "HTTP/1.1\n";
        request += "Host: www.cs.usfca.edu\n";
        request += "Connection: close";
        request += "\r\n";
        return request;
    }

    public void fetch(String urlDomain) throws Exception {

        System.out.println(urlDomain + ":" + PORT);

        // TODO: create a new socket here for a given urlDomain and a given PORT
        Socket socket = new Socket(urlDomain, PORT);

        // TODO: create PrintWriter for the socket's output stream
        PrintWriter writer = 
                new PrintWriter(new OutputStreamWriter(socket.getOutputStream()));

        BufferedReader reader = 
                new BufferedReader(new InputStreamReader(socket.getInputStream()));

        String request = createRequest(urlDomain); // createRequest is complaining that it is a string and not a URL
        System.out.println(request);
        writer.write(request);
        writer.flush();

        StringBuilder string = new StringBuilder();
        boolean htmlFound = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!htmlFound) {
                if (line.toLowerCase().startsWith("<html>")) {
                    htmlFound = true;
                } else {
                    continue;
                }
            }
            System.out.println("This is each line: " + line);
            string.append(line + "\n");
        }

        reader.close();
        writer.close();
        socket.close();

        //System.out.println(string.toString());
        System.out.println("[done]");
    }
}

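So basically I'm confused about how to pass the String urlDomain to the createRequest method when it expects a URL. Does the HTTP request even need the createRequest parameter? Am I setting up the request correctly?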

It currently outputs the following:

www.cs.usfca.edu:80
GET/index.htmlHTTP/1.1
Host: www.cs.usfca.edu
Connection: close

This is each line: <html><head>
This is each line: <title>501 Method Not Implemented</title>
This is each line: </head><body>
This is each line: <h1>Method Not Implemented</h1>
This is each line: <p>GET/index.htmlHTTP/1.1 to /index.html not supported.<br />
This is each line: </p>
This is each line: <hr>
This is each line: <address>Apache/2.2.15 (CentOS) Server at www.cs.usfca.edu Port 80</address>
This is each line: </body></html>
[done]

Thanks for your help. Let me know if I can be more specific.

1 Answer:

Answer 0: (score: 1)

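As far as I know, the Host header matters when a site lives on a shared hosting server: several domains map to the same IP address, and the server needs the Host header to decide which virtual host the request should be routed to. HTTP/1.1 also requires a Host header on every request, so you should always include it.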

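By the way, the request string in your current code contains no spaces. The request line needs a space between the method, the path, and the version (GET /index.html HTTP/1.1). Without them the server reads GET/index.htmlHTTP/1.1 as a single unknown method, which is why you get the 501 error page as the response.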

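private String createRequest(String host) { // host is the value fetch() passes in as urlDomain
    // Request line: method, path and version separated by single spaces, ended by CRLF
    String request = "GET " + "/ " + "HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    // Ask the server to close the connection so the readLine() loop reaches end of stream
    request += "Connection: close\r\n";
    request += "\r\n"; // blank line marks the end of the headers
    return request;
}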
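
If you would rather keep the URL parameter from the question, the request can be derived from the URL itself. The following is only a minimal sketch under a few assumptions: the class name GetRequestSketch and the method name buildGetRequest are made up for illustration, and only the request line, Host, and Connection headers are sent.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, not part of the original class.
final class GetRequestSketch {

    // Builds a minimal HTTP/1.1 GET request for the given URL.
    static String buildGetRequest(URL url) {
        // An empty path (e.g. http://example.com) must be sent as "/".
        String path = url.getPath().isEmpty() ? "/" : url.getPath();
        if (url.getQuery() != null) {
            path += "?" + url.getQuery();
        }
        // getPort() returns -1 when the URL does not name a port explicitly.
        String host = url.getPort() == -1
                ? url.getHost()
                : url.getHost() + ":" + url.getPort();

        return "GET " + path + " HTTP/1.1\r\n"   // spaces separate method, path, version
             + "Host: " + host + "\r\n"          // required by HTTP/1.1
             + "Connection: close\r\n"           // server closes the socket after responding
             + "\r\n";                           // blank line ends the headers
    }

    public static void main(String[] args) throws MalformedURLException {
        URL url = URI.create("http://www.cs.usfca.edu/index.html").toURL();
        System.out.print(buildGetRequest(url));
    }
}
```

With something like this, fetch() could take the URL object and pass url.getHost() to the Socket constructor, so the String versus URL mismatch goes away.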

Also, don't check like this:

if (line.toLowerCase().startsWith("<html>")) 

Use this instead, so that an opening tag with attributes (for example <html lang="en">) still matches:

if (line.toLowerCase().startsWith("<html")) 
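
To make the difference concrete, here is a small sketch of the looser check. The class and method names (HtmlStartSketch, startsHtml) are hypothetical, and trimming leading whitespace is an extra assumption that goes slightly beyond the one-line change above.

```java
import java.util.Locale;

// Hypothetical helper illustrating the prefix check.
final class HtmlStartSketch {

    // True if the line begins an <html ...> element, with or without attributes.
    static boolean startsHtml(String line) {
        return line.trim().toLowerCase(Locale.ROOT).startsWith("<html");
    }

    public static void main(String[] args) {
        String tagWithAttribute = "<html lang=\"en\">";
        // The strict prefix "<html>" misses a tag that carries attributes.
        System.out.println(tagWithAttribute.toLowerCase(Locale.ROOT).startsWith("<html>")); // false
        // The looser prefix "<html" accepts it.
        System.out.println(startsHtml(tagWithAttribute)); // true
    }
}
```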

By the way, why are you doing this by hand? Use HttpURLConnection instead.
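
A rough sketch of what that could look like follows. The class and method names (UrlConnectionSketch, fetch, readAll) are invented for illustration, the 5-second timeouts are arbitrary, and the body is assumed to be UTF-8, which a real page may not be.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical replacement for the hand-written socket code.
final class UrlConnectionSketch {

    // Reads the whole stream as text, one line at a time (assumes UTF-8).
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Sends a GET request and returns the response body (or the error body on 4xx/5xx).
    static String fetch(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // arbitrary values chosen for this sketch
        conn.setReadTimeout(5000);
        try {
            int status = conn.getResponseCode();
            // getInputStream() throws on error statuses; the error page is on getErrorStream().
            InputStream body = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            return body == null ? "" : readAll(body);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch(URI.create("http://www.cs.usfca.edu/index.html").toURL()));
    }
}
```

The connection builds the request line and the Host header for you and strips the response headers, so there is no need to scan for the opening html tag.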