I'm using the URLConnection class, and I'd like to be able to get a stream for a given URL even when that URL is unavailable (i.e., cache the last known good response for the URL's content to a local filesystem directory). I've written this code a few times now (never to my satisfaction) and wonder if there's something better out there that does this.
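A minimal sketch of the pattern being asked about, using only the standard library (the class name `CachedUrl`, the method `openWithFallback`, and the cache-file layout are my own assumptions, not an existing API): fetch the URL, refresh a local cache file on success, and fall back to the last cached copy when the fetch fails.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CachedUrl {
    /**
     * Open a stream for the URL's content. On a successful fetch the bytes are
     * copied into cacheFile first; if the fetch fails but a cached copy exists,
     * the (possibly stale) cached bytes are served instead.
     */
    public static InputStream openWithFallback(URL url, Path cacheFile) throws IOException {
        try (InputStream fresh = url.openConnection().getInputStream()) {
            Files.createDirectories(cacheFile.getParent());
            Files.copy(fresh, cacheFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException fetchFailed) {
            // No cached copy to fall back on: propagate the original failure.
            if (!Files.exists(cacheFile)) throw fetchFailed;
        }
        // Both the fresh and the fallback case read from the cache file,
        // so callers always take the same code path.
        return Files.newInputStream(cacheFile);
    }
}
```

Because the successful path also reads back from the cache file, a fetch and a fallback behave identically from the caller's point of view.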
Answer 0 (score: 1)
If you switch from URLConnection to Apache HttpClient, you can use the HttpClient Cache.
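A hedged sketch of what that might look like, assuming the separate `httpclient-cache` artifact (HttpClient 4.x) is on the classpath; the cache directory and sizes here are illustrative, not recommendations:

```java
import java.io.File;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.cache.CacheConfig;
import org.apache.http.impl.client.cache.CachingHttpClients;

public class CachingClientExample {
    public static void main(String[] args) throws Exception {
        CacheConfig cacheConfig = CacheConfig.custom()
                .setMaxCacheEntries(1000)
                .setMaxObjectSize(1024 * 1024)
                .build();
        // A file-backed cache survives JVM restarts; "/tmp/http-cache" is an
        // assumed location for illustration.
        CloseableHttpClient client = CachingHttpClients.custom()
                .setCacheConfig(cacheConfig)
                .setCacheDir(new File("/tmp/http-cache"))
                .build();
        try (CloseableHttpResponse response = client.execute(new HttpGet("http://example.com/"))) {
            // Repeat requests are served from the cache when the response's
            // HTTP caching headers allow it.
        }
        client.close();
    }
}
```

Note that this cache follows HTTP caching semantics (Cache-Control, ETag, and so on), so whether stale content is served when the origin is down depends on the response headers and the cache configuration.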
Answer 1 (score: -1)
I just ran into the same problem and put together my own WebCache class. I haven't tested it yet, but you can give it a try if you like. Just construct it with the directory where you want pages cached, then call getPage(String url) to fetch a page. getPage checks the cache directory first and, on a miss, downloads the page into the cache and returns the result. Cache file names are url.hashCode() + ".cache".
This only fetches a page's source; I'm not sure exactly what you want to do with your URLConnection, but it may help.
/**
 * A tool for downloading and reading the source code of HTML pages.
 * Prevents repeated downloading of pages by storing each page in a cache.
 * When it receives a page request, it first looks in its cache.
 * If it does not have the page cached, it will download it.
 *
 * Pages are stored as <cachedir>/<hashcode>.cache
 *
 * @author Mike Turley
 */
import java.io.*;
import java.net.*;

public class WebCache {
    File cachedir;
    boolean enabled;

    /**
     * Create a web cache in the given directory.
     */
    public WebCache(File cachedir, boolean enabled) {
        this.cachedir = cachedir;
        this.enabled = enabled;
    }
    public WebCache(String cachedir, boolean enabled) {
        this(new File(cachedir), enabled);
    }
    public WebCache(File cachedir) {
        this(cachedir, true);
    }
    public WebCache(String cachedir) {
        this(new File(cachedir), true);
    }

    /**
     * Map a URL to its cache file. The File(parent, child) constructor
     * supplies the path separator, fixing the missing-slash bug.
     */
    private File cacheFile(String url) {
        return new File(cachedir, url.hashCode() + ".cache");
    }

    /**
     * Get the content for the given URL.
     * First check the cache, then check the internet.
     */
    public String getPage(String url) {
        try {
            if (enabled && cacheFile(url).exists()) return loadCachedPage(url);
            return downloadPage(url);
        } catch (Exception e) {
            System.err.println("Problem getting page at " + url);
            e.printStackTrace();
            return null;
        }
    }

    public void clear() {
        try {
            File[] cachefiles = cachedir.listFiles();
            if (cachefiles != null) {
                for (int i = 0; i < cachefiles.length; i++) {
                    cachefiles[i].delete();
                }
            }
            cachedir.delete();
        } catch (Exception e) {
            System.err.println("Problem clearing the cache!");
            e.printStackTrace();
        }
    }

    public String downloadPage(String url) {
        try {
            URLConnection urlc = new URL(url).openConnection();
            urlc.setDoInput(true);
            urlc.setDoOutput(false);
            BufferedReader in = new BufferedReader(new InputStreamReader(urlc.getInputStream()));
            if (!cachedir.exists()) cachedir.mkdirs();
            PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(cacheFile(url))));
            StringBuilder sb = new StringBuilder();
            String inputline;
            while ((inputline = in.readLine()) != null) {
                out.println(inputline);
                sb.append(inputline).append('\n'); // preserve line breaks in the returned page
            }
            in.close();
            out.close();
            return sb.toString();
        } catch (Exception e) {
            System.err.println("Problem connecting to URL " + url);
            e.printStackTrace();
            return null;
        }
    }

    public String loadCachedPage(String url) {
        try {
            BufferedReader in = new BufferedReader(new FileReader(cacheFile(url)));
            StringBuilder sb = new StringBuilder();
            String inputline;
            while ((inputline = in.readLine()) != null) {
                sb.append(inputline).append('\n');
            }
            in.close();
            return sb.toString();
        } catch (Exception e) {
            System.err.println("Problem loading cached page " + url);
            e.printStackTrace();
            return null;
        }
    }

    public void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }
}
Answer 2 (score: -1)
Don't do that. Deploy a caching HTTP proxy instead, such as Apache Squid.
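If you go this route, the JVM can be pointed at the proxy without any code changes, via the standard `http.proxyHost`/`http.proxyPort` system properties. A small sketch (the host `localhost` and port 3128, Squid's default, are assumptions for illustration):

```java
public class ProxySetup {
    /**
     * Route the JVM's plain-HTTP URLConnection traffic through a proxy by
     * setting the standard networking system properties.
     */
    public static void useProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
    }

    public static void main(String[] args) {
        useProxy("localhost", 3128); // assumed Squid instance on its default port
        // Subsequent URLConnection requests in this JVM now go through the proxy,
        // which handles caching transparently.
    }
}
```

These properties can equally be passed on the command line (`-Dhttp.proxyHost=… -Dhttp.proxyPort=…`), which keeps the caching concern entirely out of the application code.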