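I want to fetch a web page and save its content as a string. Is there a library that can do this? I want to use the string in a program I'm building. It is meant for websites that do not necessarily provide an RSS feed.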
Answer 0: (score: 3)
I think this is what you need (IOUtils comes from Apache Commons IO):
URL url = new URL("http://www.google.com/");
URLConnection con = url.openConnection();
InputStream in = con.getInputStream();
// Do not use con.getContentEncoding() here: it returns the Content-Encoding header (e.g. "gzip"), not the charset.
// The charset is part of con.getContentType(), e.g. "text/html; charset=UTF-8", and would have to be parsed out; this example simply assumes UTF-8.
String encoding = "UTF-8";
String body = IOUtils.toString(in, encoding);
System.out.println(body);
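The comment in the snippet above notes that the charset has to be parsed out of the Content-Type value. A minimal sketch of that step, using only the JDK, might look like the following. The class name PageFetcher and the helpers charsetFromContentType and fetch are hypothetical names chosen for illustration, and the fallback to UTF-8 is an assumption carried over from the snippet, not something the server guarantees.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class PageFetcher {

    // Extracts the charset parameter from a Content-Type value such as
    // "text/html; charset=UTF-8". Falls back to UTF-8 (an assumption) when
    // the header is missing, has no charset, or names an unknown charset.
    static Charset charsetFromContentType(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive check for the "charset=" prefix.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // malformed or unsupported charset name
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Downloads the page at the given address and decodes it as a string.
    static String fetch(String address) throws IOException {
        URLConnection con = new URL(address).openConnection();
        try (InputStream in = con.getInputStream()) {
            Charset charset = charsetFromContentType(con.getContentType());
            // InputStream.readAllBytes() requires Java 9 or later.
            return new String(in.readAllBytes(), charset);
        }
    }
}
```

Calling PageFetcher.fetch("http://www.google.com/") would then return the page body as a string. This sketch only looks at the HTTP header; pages that declare their charset solely in an HTML meta tag would still be decoded as UTF-8.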
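Answer 1: (score: 1)
May I suggest JSoup?
Document doc = Jsoup.connect("http://www.google.com").get(); // the URL must include the protocol (http or https)
String html = doc.outerHtml(); // the whole page as a string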
Answer 2: (score: 0)
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://www.google.gr");
try (CloseableHttpResponse response = httpclient.execute(httpget)) {
HttpEntity entity = response.getEntity();
if (entity != null) {
System.out.println(EntityUtils.toString(entity));
}
// An explicit response.close() is not needed here: try-with-resources closes the response.
} catch (IOException ex) {
Logger.getLogger(HttpClient.class.getName()).log(Level.SEVERE, null, ex);
}