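I am using the Jsoup Java HTML parser to fetch images from a particular URL. However, some of the images fail with a status 502 error instead of being saved to my machine. Here is a snapshot of the code I am using: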
String url = "http://www.jabong.com";
// connect(...).get() already returns a parsed Document with the base URI set
Document doc = Jsoup.connect(url).get();
Elements images = doc.select("img");
for (Element element : images) {
    String imgSrc = element.attr("abs:src");
    log.info(imgSrc);
    // compare string contents, not references
    if (!imgSrc.isEmpty()) {
        saveFromUrl(imgSrc, dirPath + "/" + nameCounter + ".jpg");
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            log.error("error in sleeping");
        }
        nameCounter++;
    }
}
The saveFromUrl function looks like this:
public static void saveFromUrl(String Url, String destinationFile) {
    // try-with-resources closes both streams even if the copy fails midway
    try (InputStream is = new URL(Url).openStream();
         OutputStream os = new FileOutputStream(destinationFile)) {
        byte[] b = new byte[2048];
        int length;
        while ((length = is.read(b)) != -1) {
            os.write(b, 0, length);
        }
    } catch (IOException e) {
        log.error("Error in saving file from url:" + Url);
        //e.printStackTrace();
    }
}
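I searched the internet for status code 502, and it says the error is caused by a bad gateway. I don't understand this. I think one possible cause is that I am requesting the images in a loop: perhaps the web server cannot handle that much load and therefore refuses a request for an image when the previous one has not finished being sent. So I tried sleeping after fetching each image, but no luck :( Please suggest something.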
Answer 0 (score: 1)
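Your problem sounds like an HTTP communication issue, so you would be better off using a library that handles the communication side for you. Take a look at Apache Commons HttpClient.
A few notes on your code example. You are not using a URLConnection object, so it is unclear how the web/proxy server is behaving, whether resources are being closed cleanly, and so on. The HttpClient library mentioned above will help in this regard.
There also seem to be some examples of doing what you want using J2ME libraries. Not something I have used personally, but it may also help you.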
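As a rough illustration of the point about inspecting the connection, the sketch below uses the JDK's HttpURLConnection to read the status code before saving anything, and retries a few times when the server answers 502, 503, or 504. The class name RetryingImageDownloader, the helper names, the User-Agent string, the timeouts, and the backoff schedule are all assumptions made for illustration; none of them come from the question, and whether a retry actually helps depends on why the gateway is failing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RetryingImageDownloader {

    /** True for gateway-style errors that may succeed on a later attempt. */
    public static boolean isTransient(int status) {
        return status == 502 || status == 503 || status == 504;
    }

    /** Exponential backoff: base, 2*base, 4*base, ... for attempt 1, 2, 3, ... */
    public static long backoffMillis(int attempt, long baseMillis) {
        return baseMillis << (attempt - 1);
    }

    /** Returns true if the image was saved, false if the server kept refusing. */
    public static boolean download(String urlString, Path dest, int maxAttempts)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(urlString).openConnection();
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);
            // Assumption: some servers treat the default Java agent differently
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (image-downloader sketch)");
            try {
                int status = conn.getResponseCode();
                if (status == HttpURLConnection.HTTP_OK) {
                    try (InputStream in = conn.getInputStream()) {
                        Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
                    }
                    return true;
                }
                System.err.println(urlString + " -> HTTP " + status);
                if (!isTransient(status)) {
                    return false; // e.g. 404: retrying will not help
                }
            } finally {
                conn.disconnect();
            }
            if (attempt < maxAttempts) {
                Thread.sleep(backoffMillis(attempt, 1000));
            }
        }
        return false;
    }
}
```

A call such as RetryingImageDownloader.download(imgSrc, Paths.get(dirPath, nameCounter + ".jpg"), 3) could stand in for saveFromUrl in the question's loop; logging the status code at least shows which URLs the gateway rejects.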
Answer 1 (score: 1)
Here is a complete code sample that works for me...
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
// Authenticator, InetSocketAddress, Proxy and SocketAddress are only
// needed for the proxy variant shown further down
import java.net.Authenticator;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.MalformedURLException;
import java.net.Proxy;
import java.net.SocketAddress;
import java.net.URL;

public class DownloadImage {

    public static void main(String[] args) {
        // URLs for Images we wish to download
        String[] urls = {
            "http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon.png",
            "http://www.google.co.uk/images/srpr/logo3w.png",
            "http://i.microsoft.com/global/en-us/homepage/PublishingImages/sprites/microsoft_gray.png"
        };
        for (int i = 0; i < urls.length; i++) {
            downloadFromUrl(urls[i]);
        }
    }

    /*
     Extract the file name from the URL
    */
    private static String getOutputFileName(URL url) {
        String[] urlParts = url.getPath().split("/");
        return "c:/temp/" + urlParts[urlParts.length - 1];
    }

    /*
     Assumes there is no Proxy server involved.
    */
    private static void downloadFromUrl(String urlString) {
        InputStream is = null;
        FileOutputStream fos = null;
        try {
            URL url = new URL(urlString);
            System.out.println("Reading..." + url);
            // No proxy here, so call openConnection() without arguments
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            is = conn.getInputStream();
            String filename = getOutputFileName(url);
            fos = new FileOutputStream(filename);
            byte[] readData = new byte[1024];
            int i = is.read(readData);
            while (i != -1) {
                fos.write(readData, 0, i);
                i = is.read(readData);
            }
            System.out.println("Created file: " + filename);
        }
        catch (MalformedURLException e) {
            e.printStackTrace();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    System.out.println("Big problems if InputStream cannot be closed");
                }
            }
            if (fos != null) {
                try {
                    fos.close();
                } catch (IOException e) {
                    System.out.println("Big problems if FileOutputStream cannot be closed");
                }
            }
        }
        System.out.println("Completed");
    }
}
You should see the following output on the console...
Reading...http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon.png
Created file: c:/temp/apple-touch-icon.png
Completed
Reading...http://www.google.co.uk/images/srpr/logo3w.png
Created file: c:/temp/logo3w.png
Completed
Reading...http://i.microsoft.com/global/en-us/homepage/PublishingImages/sprites/microsoft_gray.png
Created file: c:/temp/microsoft_gray.png
Completed
This is a working example without a proxy server.
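On Java 7 and later the same copy step can be written more compactly. The following is a minimal sketch, not part of the original answer: StreamSaver, save, and fileNameFromUrl are hypothetical names, and the "download" fallback file name is an arbitrary choice. It relies on Files.copy in place of the manual read loop, and it assumes the URL is a hierarchical http or https URL so that it has a path component.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamSaver {

    /** Last path segment of the URL, or "download" if there is no file name. */
    public static String fileNameFromUrl(String urlString) {
        String path = URI.create(urlString).getPath();
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? "download" : name;
    }

    /** Copies the stream to dest, replacing any existing file; returns bytes written. */
    public static long save(InputStream in, Path dest) throws IOException {
        return Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

In downloadFromUrl, wrapping conn.getInputStream() in a try-with-resources statement and passing it to save(...) would replace both the read loop and the finally block, since the stream is closed automatically when the try block exits.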
Only if you need to authenticate against a proxy server, here is an additional class you will need, based on this Oracle technote:
import java.net.Authenticator;
import java.net.PasswordAuthentication;

public class ProxyAuthenticator extends Authenticator {

    private String userName, password;

    public ProxyAuthenticator(String userName, String password) {
        this.userName = userName;
        this.password = password;
    }

    // Called by the JDK whenever the proxy asks for credentials
    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        return new PasswordAuthentication(userName, password.toCharArray());
    }
}
To use this new class, replace the call to openConnection() shown above with the following code:
...
try {
    URL url = new URL(urlString);
    System.out.println("Reading..." + url);
    // Register the credentials, then route the request through the proxy
    Authenticator.setDefault(new ProxyAuthenticator("username", "password"));
    SocketAddress addr = new InetSocketAddress("proxy.server.com", 80);
    Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection(proxy);
...
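If the same program has to run both with and without a proxy, one option is to choose the Proxy object in a small helper and always call openConnection(proxy). This is only a sketch: ProxyChoice, proxyFor, and open are hypothetical names, and treating a null or empty host as "no proxy" is an assumed convention. Proxy.NO_PROXY is the JDK constant for a direct connection.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

public class ProxyChoice {

    /** Direct connection when host is null or empty, otherwise an HTTP proxy. */
    public static Proxy proxyFor(String host, int port) {
        if (host == null || host.isEmpty()) {
            return Proxy.NO_PROXY;
        }
        // createUnresolved defers the DNS lookup until a connection is opened
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    /** Creates (but does not yet connect) an HTTP connection via the chosen proxy. */
    public static HttpURLConnection open(String urlString, String proxyHost, int proxyPort)
            throws IOException {
        URL url = new URL(urlString);
        return (HttpURLConnection) url.openConnection(proxyFor(proxyHost, proxyPort));
    }
}
```

With this in place, ProxyChoice.open(urlString, null, 0) behaves like the no-proxy example, while ProxyChoice.open(urlString, "proxy.server.com", 80) goes through the proxy; Authenticator.setDefault(...) is still needed when the proxy requires credentials.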