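I'm trying to collect the HTML from this site: http://movies.about.com/od/actorsalphalist/Actors_Detailed_Movie_News_Interviews_Websites.htm
I open a socket and try to read and print each line of the HTML page. When I run it, the only output I get is "EOF is false" followed by "1".
I'm not sure what is going wrong, since I know this approach worked in another example... Thanks a lot for your help!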
import java.net.*;
import java.io.*;
import java.util.*;

public class Twitter {
    static final int DEFAULT_PORT = 80;
    protected DataInputStream reply = null;
    protected PrintStream send = null;
    protected Socket sock = null;

    // ***********************************************************
    // *** The constructors create the socket and set up the input
    // *** and output channels on that socket.
    public Twitter() throws UnknownHostException, IOException {
        this(DEFAULT_PORT);
    }

    public Twitter(int port) throws UnknownHostException, IOException {
        sock = new Socket("movies.about.com", port);
        System.out.println(sock);
        reply = new DataInputStream(sock.getInputStream());
        System.out.println();
        send = new PrintStream(sock.getOutputStream());
    }

    // ***********************************************************
    // *** forecast uses the socket that has already been created
    // *** to carry on a conversation with the Web server that it
    // *** has been contacted through the socket.
    public void forecast() {
        int i;
        String HTMLline;
        boolean eof, gotone;
        // *** This issues the same query that a Web browser would issue
        // *** to the Web server.
        try {
            send.println("GET /od/actorsalphalist/Actors_Detailed_Movie_News_Interviews_Websites.htm HTTP/1.1");
        } catch (Exception e) {
            System.out.println("about.com server is down.");
        }
        // *** This section parses the response from the Web server.
        // *** NOTE THAT "real" EOF does not occur until the Web server
        // *** has closed the connection.
        eof = false;
        gotone = false;
        while (!eof) {
            System.out.println("EOF is false");
            try {
                System.out.println("1");
                HTMLline = reply.readLine();
                System.out.println("2");
                System.out.println(HTMLline);
                System.out.println("Here?");
                if (HTMLline != null) {
                    System.out.println("its not null");
                }
                if (HTMLline == null) {
                    System.out.println("WTFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF");
                } else {
                    eof = true;
                    System.out.println("is it?");
                }
            } catch (Exception e) {
                System.out.println("this exception happend");
                e.printStackTrace();
                eof = true;
            }
        }
    }

    // ***********************************************************
    // *** We need to close the socket when this class is destroyed.
    protected void finalize() throws Throwable {
        sock.close();
    }

    // ***********************************************************
    // *** The main program creates a new Twitter class and
    // *** sends that class the command line args (via findNumber).
    public static void main(String[] args) {
        Twitter aboutCom;
        DataInputStream cin = new DataInputStream(System.in);
        try {
            aboutCom = new Twitter();
            aboutCom.forecast();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Answer 0 (score: 1)
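You haven't sent a valid HTTP request, so the server is still waiting for you to finish it. The GET line must be terminated by \r\n, and then you need a second \r\n, i.e. an empty line, to mark the end of the request headers.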
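To make the point about line terminators concrete, here is a minimal sketch of a raw-socket request that the server should accept. The class name RawHttpGet and its methods are made up for illustration. The Host and Connection: close headers are additions beyond what is described above: HTTP/1.1 requires a Host header, and Connection: close asks the server to close the socket after the response, so readLine() eventually returns null instead of blocking. Decoding the response as ISO-8859-1 is also an assumption made only to keep the sketch short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original program.
class RawHttpGet {

    // Builds a minimal HTTP/1.1 GET request. Every line ends with an
    // explicit CRLF, and a final empty line terminates the header block.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"        // assumption: required by HTTP/1.1
             + "Connection: close\r\n"          // assumption: server closes when done
             + "\r\n";                          // empty line = end of headers
    }

    // Sends the request and prints the raw response (status line,
    // headers, and body) line by line until the server closes the socket.
    static void fetch(String host, int port, String path) throws IOException {
        try (Socket sock = new Socket(host, port)) {
            OutputStream out = sock.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(sock.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Note that PrintStream.println() writes the platform line separator (plain \n on Unix-like systems) and only one of them, which is why the original request never looks complete to the server. The read loop here also keeps going until readLine() returns null, rather than stopping after the first non-null line.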
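But you should really use URL, openConnection(), getInputStream() and so on for this, rather than redundantly trying to re-implement HTTP yourself. Doing it by hand the way you have gains you nothing except more opportunities for error.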
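A minimal sketch of the URL-based approach might look like the following. The class name PageFetcher and both method names are hypothetical, and reading the page as UTF-8 is an assumption; a more careful version would take the charset from the response's Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original program.
class PageFetcher {

    // Reads every line from the stream until end of stream is reached.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        // Assumption: the content is UTF-8 encoded.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Lets URLConnection build and send the HTTP request, then returns
    // the response body as a list of lines.
    static List<String> fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return readLines(in);
        }
    }
}
```

Calling PageFetcher.fetch("http://movies.about.com/od/actorsalphalist/Actors_Detailed_Movie_News_Interviews_Websites.htm") and printing each returned line would replace the whole forecast() method; the request line, the headers, and the CRLF handling are all taken care of by the library.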