I am using this code:
public static void main(String[] args) throws IOException {
    String EngLink;
    URL EngUrl;
    URLConnection EngCon;
    String cookiesHeader;
    InputStream EngIs;
    BufferedReader EngBr;
    String line;
    String EngPageHtml="";
    EngLink="https://www.zomato.com/";
    EngUrl = new URL(EngLink);
    EngCon = (HttpURLConnection) EngUrl.openConnection();
    // Send a browser-like User-Agent with the request.
    EngCon.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
    // The SocketTimeoutException is thrown from this call.
    EngIs = EngCon.getInputStream();
    EngBr = new BufferedReader(new InputStreamReader(EngIs,"UTF-8"));
    // Read the response line by line into one string.
    while ((line = EngBr.readLine()) != null) {
        EngPageHtml = EngPageHtml + "\n" + line;
    }
    System.out.println(EngPageHtml);
}
What I want to do is fetch the raw HTML of the website. However, when I run the code, I get this error:
Exception in thread "main" java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at sun.security.ssl.InputRecord.readFully(Unknown Source)
    at sun.security.ssl.InputRecord.read(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
    at sun.security.ssl.AppInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
    at project1.Aaa.main(Aaa.java:33)
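I use this same code to fetch the HTML of several other websites successfully, but it does not work for this particular site.
What could the problem be, and how can I fix it?
Edit: Loading the site in Firefox, copying the cookie from there, and passing it in with: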
EngCon.setRequestProperty("Cookie",cookie);
makes the page load, but that is not a usable solution because the copied cookie cannot be reused indefinitely.
Answer 0 (score: 0)
This was solved by adding one more request property:
EngCon.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
Nothing else was needed.
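
For reference, here is a minimal sketch of how the fix might look when folded into the original code. The `Accept-Language` value and the `User-Agent` string come from the post above. Everything else is an assumption made for illustration: the class name `PageFetcher`, the helper names `open` and `readAll`, the explicit timeout values, and taking the URL from the command line. Why this server stalls without the header is not stated here, so treat the sketch as one way to apply the fix rather than an explanation of it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class PageFetcher {

    // Prepares (but does not yet open) a connection with browser-like headers.
    static HttpURLConnection open(String link) throws IOException {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(link).toURL().openConnection();
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) "
                + "Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
        // The header from the answer above.
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        // Assumed values: explicit timeouts so a stalled server fails predictably.
        con.setConnectTimeout(10_000);
        con.setReadTimeout(15_000);
        return con;
    }

    // Reads everything from the reader, ending each line with '\n'.
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PageFetcher <url>");
            return;
        }
        HttpURLConnection con = open(args[0]);
        // The actual network request happens when the input stream is requested.
        System.out.println(readAll(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8)));
    }
}
```

Running it as `java PageFetcher https://www.zomato.com/` would attempt the same request as in the question. The `StringBuilder` and try-with-resources are small cleanups over the original loop and do not change what is sent to the server.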