I am using an HTTP client to fetch data:
public static String getHttpResponse(String url) {
    //LOGGER.info("Download page context from URL " + url);
    String httpClientResponse = null;
    try {
        URI uri = new URIBuilder(url).build();
        HttpResponse response;
        HttpHost target = new HttpHost(uri.getHost());
        HttpGet request = new HttpGet(uri);
        //request.setConfig(config);
        request.addHeader(new BasicHeader("User-Agent", "Mozilla/5.0"));
        request.addHeader(new BasicHeader("Content-Type", "text/html"));
        request.addHeader("Accept-Ranges", "bytes=100-1500");
        org.apache.http.client.HttpClient client = HttpClients.custom().build();
        response = client.execute(target, request);
        //LOGGER.info("Status Line for URL {} is {}", uri.getHost() + File.separator + uri.getPath(), response.getStatusLine());
        InputStream inputStream = response.getEntity().getContent();
        if (inputStream == null || response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
            /*LOGGER.error("Non-success response while downloading image. Response {}", response.getStatusLine());
            LOGGER.error("Error while download data from url {}", url);*/
        } else {
            httpClientResponse = IOUtils.toString(inputStream, CharEncoding.UTF_8);
        }
    } catch (Exception e) {
        System.out.println("Error while download content from URL");
    }
    return httpClientResponse;
}
Also: can we use Jsoup for this?
Thanks.
Answer 0: (score: 1)
Replace:
request.addHeader("Accept-Ranges", "bytes=100-1500");
with:
request.addHeader("Range", "bytes=100-1500");
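The Accept-Ranges header is part of the server's response; it indicates that the server accepts partial (range) requests.
In your request you should use the Range header instead, which tells the server which part of the document it should return.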
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Ranges
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range
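To make the difference concrete, here is a minimal sketch of a range request. It uses the JDK's built-in java.net.http API (Java 11+) rather than the Apache HttpClient from the question, purely so that it compiles without extra dependencies. The class name RangeRequestSketch, the helper names, and the example.com URL are all illustrative assumptions, not part of the original code. The sketch only builds the request; it does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper class, for illustration only.
final class RangeRequestSketch {

    // Builds the value of a Range request header for an inclusive byte range.
    static String rangeValue(long firstByte, long lastByte) {
        if (firstByte < 0 || lastByte < firstByte) {
            throw new IllegalArgumentException("invalid byte range");
        }
        return "bytes=" + firstByte + "-" + lastByte;
    }

    // Builds (but does not send) a GET request asking for part of a document.
    static HttpRequest partialGet(String url, long firstByte, long lastByte) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Range", rangeValue(firstByte, lastByte))
                .GET()
                .build();
    }

    // 206 (Partial Content) means the server honored the range;
    // 200 (OK) means it ignored the Range header and sent the whole document.
    static boolean isUsableStatus(int statusCode) {
        return statusCode == 206 || statusCode == 200;
    }

    public static void main(String[] args) {
        // example.com is a placeholder URL; nothing is sent over the network here.
        HttpRequest request = partialGet("http://example.com/", 100, 1500);
        System.out.println(request.headers().firstValue("Range").orElse("(none)"));
    }
}
```

One thing to watch for once the header is corrected: a server that honors the range normally answers with status 206 rather than 200, so the `!= HttpStatus.SC_OK` check in the question's code would likely treat a successful partial response as a failure. Accepting both status codes, as in isUsableStatus above, is one way to handle that.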