How to fetch HTML content with HttpComponents

Time: 2014-02-24 00:29:09

Tags: java apache httpclient

I am trying to fetch HTML content using the HttpComponents library.

Here is my code:

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class test {

    public static void main(String[] args) throws URISyntaxException, IOException {

        CloseableHttpClient httpclient = HttpClients.createDefault();

        URI uri = new URIBuilder()
                    .setScheme("http")
                    .setHost("sandbox.ala.org.au")
                    .setPath("/datacheck/dataCheck/processData")
                    .setParameter("headers", "vernacularName")
                    .setParameter("firstLineIsData", "true")
                    .setParameter("rawData", "aaa")
                    .build();

        HttpGet httpget = new HttpGet(uri.toString());
        System.out.println(httpget.getRequestLine());

        ResponseHandler<String> responseHandler = new ResponseHandler<String>() {

            public String handleResponse(
                    final HttpResponse response) throws ClientProtocolException, IOException {
                int status = response.getStatusLine().getStatusCode();
                if (status >= 200 && status < 300) {
                    HttpEntity entity = response.getEntity();
                    return entity != null ? EntityUtils.toString(entity) : null;
                } else {
                    throw new ClientProtocolException("Unexpected response status: " + status);
                }
            }
        };

        String responseBody = httpclient.execute(httpget, responseHandler);
        System.out.println(responseBody);

    }
}

But I get this error message:

Exception in thread "main" org.apache.http.client.ClientProtocolException: Unexpected response status: 405

The HTML content should be the same as what this curl command returns:

curl --data "headers=vernacularName&firstLineIsData=true&rawData=aaa" http://sandbox.ala.org.au/datacheck/dataCheck/processData  

1 answer:

Answer 0: (score: 0)

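Your curl command sends a POST request (that is what --data does), but your Java code sends a GET request. Status 405 means Method Not Allowed, so the server is rejecting GET for this URL. Change the Java code to send a POST with the same form fields and the error should go away.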
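As a minimal sketch of that change, assuming HttpClient 4.3 or later (the version that provides the CloseableHttpClient used in the question), the request could be built as a form-encoded POST like this. The class name PostExample and the helper buildRequest are made up for illustration, and the response handling is reduced to printing the status line and body.

```java
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; not part of the original question.
class PostExample {

    // Builds the POST request without sending it. The form fields go in
    // the request body, which is what curl's --data option does.
    static HttpPost buildRequest() {
        List<NameValuePair> form = new ArrayList<>();
        form.add(new BasicNameValuePair("headers", "vernacularName"));
        form.add(new BasicNameValuePair("firstLineIsData", "true"));
        form.add(new BasicNameValuePair("rawData", "aaa"));

        HttpPost httppost = new HttpPost(
                "http://sandbox.ala.org.au/datacheck/dataCheck/processData");
        // Encodes the pairs as application/x-www-form-urlencoded.
        httppost.setEntity(new UrlEncodedFormEntity(form, StandardCharsets.UTF_8));
        return httppost;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes both the client and the response.
        try (CloseableHttpClient httpclient = HttpClients.createDefault();
             CloseableHttpResponse response = httpclient.execute(buildRequest())) {
            System.out.println(response.getStatusLine());
            HttpEntity entity = response.getEntity();
            System.out.println(entity != null ? EntityUtils.toString(entity) : "");
        }
    }
}
```

The ResponseHandler from the question can be reused unchanged by passing it as the second argument to httpclient.execute together with the HttpPost.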

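If you are unsure which method curl is using, add the -v flag to the command. The verbose output also shows other useful details, such as the request headers: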

> POST /datacheck/dataCheck/processData HTTP/1.1
> User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: sandbox.ala.org.au
> Accept: */*
> Content-Length: 55
> Content-Type: application/x-www-form-urlencoded
>
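The Content-Type and Content-Length headers above describe the form-encoded body that curl sends. As a rough illustration using only the JDK, the sketch below builds the same body from the three fields and prints its size in bytes. The class name FormBody and its methods are hypothetical helpers, not part of HttpComponents.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
class FormBody {

    // URL-encodes one key or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Joins the fields as key=value pairs separated by '&',
    // i.e. the application/x-www-form-urlencoded format.
    static String encode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("headers", "vernacularName");
        fields.put("firstLineIsData", "true");
        fields.put("rawData", "aaa");

        String body = encode(fields);
        System.out.println(body);
        // Prints 55, matching the Content-Length header in the curl output.
        System.out.println(body.getBytes(StandardCharsets.UTF_8).length);
    }
}
```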