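So I'm using Apache Commons HTTP to make a request to a web page. I can't for the life of me figure out how to get the actual content of the page; I can only get its header information. How can I get the actual content?
Here is my example code: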
HttpGet request = new HttpGet("http://URL_HERE/");
HttpClient httpClient = new DefaultHttpClient();
HttpResponse response = httpClient.execute(request);
System.out.println("Response: " + response.toString());
Thanks!
Answer 0 (score: 15)
BalusC's comment will work just fine.
If you're using version 4 or newer of Apache HttpComponents, there is also a convenience method you can use:
EntityUtils.toString(HttpEntity);
Here is what that looks like in code:
HttpGet request = new HttpGet("http://URL_HERE/");
HttpClient httpClient = new DefaultHttpClient();
HttpResponse response = httpClient.execute(request);
HttpEntity entity = response.getEntity();
String entityContents = EntityUtils.toString(entity);
I hope that helps.
Not sure whether this is due to a different version, but I had to rewrite it as:
HttpGet request = new HttpGet("http://URL_HERE/");
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpResponse response = httpClient.execute(request);
HttpEntity entity = response.getEntity();
String entityContents = EntityUtils.toString(entity);
Answer 1 (score: 11)
Use HttpResponse#getEntity() and then HttpEntity#getContent() to obtain the body as an InputStream.
InputStream input = response.getEntity().getContent();
// Read it the usual way.
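For anyone unsure what "the usual way" means here, below is a minimal sketch using only the JDK. The class name StreamText and the method readAll are made up for illustration, and the charset is passed in by the caller on the assumption that you know (or have read from the Content-Type header) how the body is encoded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpComponents.
class StreamText {

    // Reads the whole stream into a String and closes the stream afterwards.
    // IOException is rethrown unchecked only to keep the call site short.
    static String readAll(InputStream input, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(input, charset))) {
            char[] buf = new char[4096];
            int n;
            // read() returns -1 once the end of the stream is reached
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With the snippet above you would call it as StreamText.readAll(input, StandardCharsets.UTF_8), assuming the page is UTF-8.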
Note that HttpClient is not part of Apache Commons. It is part of Apache HttpComponents.
Answer 2 (score: 1)
response.getEntity();
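You really should take a look at the Javadocs; the HttpClient examples show you how to get at all the information in the response: http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/index.html
Answer 3 (score: 1)
If you only want the content of a URL, you can use the URL API, like this: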
import java.io.IOException;
import java.net.URL;
import java.util.Scanner;
public class URLTest {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com.br");
        // url.openStream() gives you the input stream, so you can do whatever you want with it
        try (Scanner in = new Scanner(url.openStream())) {
            // print the first line of the response body
            System.out.println(in.nextLine());
        }
    }
}
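The example above reads only the first line. If you want the whole body as one String, a common trick is to set the Scanner delimiter to \A (start of input), so that a single next() call returns everything. This is only a sketch: the class name UrlText and the method readAll are made up for illustration, and UTF-8 is an assumption about the page encoding.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;

// Hypothetical helper for illustration only.
class UrlText {

    // Returns the full content of the stream, or "" if the stream is empty.
    // Closing the Scanner also closes the underlying stream.
    static String readAll(InputStream input) {
        try (Scanner in = new Scanner(input, "UTF-8")) {
            // "\\A" matches only at the start of input, so the first token is the whole stream
            in.useDelimiter("\\A");
            return in.hasNext() ? in.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com.br");
        System.out.println(readAll(url.openStream()));
    }
}
```

Note that Scanner swallows IOExceptions from the underlying stream (they are available through ioException()), so this is best suited to quick scripts rather than code that needs careful error handling.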