Unable to retrieve page 2 using a POST request

Time: 2014-02-13 11:17:55

Tags: java http

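I am trying to retrieve the HTML of page 2 of this URL. I added values for the required POST form data, such as __EVENTTARGET and __EVENTARGUMENT, but the request still returns only the first page. Any clue as to what I might be missing?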

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpURLConnectionExample {
public static void main(String[] args) throws Exception {
    URL url = new URL(
            "http://www.themetaldirectory.com/?featured=0&country=USA");
    Map<String, Object> params = new LinkedHashMap<>();
    params.put("__EVENTTARGET", "ctl00%24ctl00%24ContentPlaceHolderDefault%24DirectoryDisplay_12%24GridView1");
    params.put("__EVENTARGUMENT", "Page$2");

    StringBuilder postData = new StringBuilder();
    for (Map.Entry<String, Object> param : params.entrySet()) {
        if (postData.length() != 0)
            postData.append('&');
        postData.append(URLEncoder.encode(param.getKey(), "UTF-8"));
        postData.append('=');
        postData.append(URLEncoder.encode(String.valueOf(param.getValue()),
                "UTF-8"));
    }
    byte[] postDataBytes = postData.toString().getBytes("UTF-8");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type",
            "application/x-www-form-urlencoded");
    conn.setRequestProperty("Content-Length",
            String.valueOf(postDataBytes.length));
    conn.setDoOutput(true);
    conn.getOutputStream().write(postDataBytes);

    Reader in = new BufferedReader(new InputStreamReader(
            conn.getInputStream(), "UTF-8"));
    for (int c; (c = in.read()) >= 0; System.out.print((char) c))
        ;
}

}

1 answer:

Answer 0 (score: 2)

The value you are sending for the __EVENTTARGET parameter is incorrect.

The expected value is:

ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1

However, you are URL-encoding the value twice: first by hand, when you set the parameter value

params.put("__EVENTTARGET",
"ctl00%24ctl00%24ContentPlaceHolderDefault%24DirectoryDisplay_12%24GridView1");

and a second time when the loop encodes each parameter value while building the POST body
postData.append(URLEncoder.encode(String.valueOf(param.getValue()), "UTF-8"));

So what you actually send is a double URL-encoded value:

ctl00%2524ctl00%2524ContentPlaceHolderDefault%2524DirectoryDisplay_12%2524GridView1
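
The effect can be reproduced in isolation. The following is a minimal sketch, not part of the original program: the class name DoubleEncodingDemo and the helper encodeOnce are hypothetical names introduced for illustration. It encodes the raw control ID once and twice with URLEncoder, then decodes each result once, which is assumed here to be what the server does with a form field.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class DoubleEncodingDemo {

    // Hypothetical helper: URL-encodes a value exactly once as UTF-8,
    // the same call the request-building loop makes for each parameter.
    public static String encodeOnce(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String raw = "ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1";

        String once = encodeOnce(raw);    // '$' becomes "%24"
        String twice = encodeOnce(once);  // '%' becomes "%25", giving "%2524"

        System.out.println(once);
        System.out.println(twice);

        // Decoding once (assumed server behaviour) recovers the raw ID
        // only from the singly encoded value.
        System.out.println(URLDecoder.decode(once, "UTF-8").equals(raw));   // true
        System.out.println(URLDecoder.decode(twice, "UTF-8").equals(raw));  // false
    }
}
```

After a single decode, the doubly encoded value still contains %24 where the control ID has $, so it no longer matches the expected __EVENTTARGET value.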

Set the parameter value without URL-encoding it yourself, so that it is encoded only once by the loop, and it should work:

 params.put("__EVENTTARGET",
 "ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1");

Edit

If you want to save the resulting InputStream as a String, there are several ways to do it. One of them is:

// Accumulate the response body line by line.
StringBuilder inputStringBuilder = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));

String line;
while ((line = br.readLine()) != null) {
    // readLine() strips the line terminator, so add one back.
    inputStringBuilder.append(line).append('\n');
}
br.close();
String htmlString = inputStringBuilder.toString();
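
Another of those approaches, sketched here under assumptions: because readLine() strips line terminators, reading into a character buffer instead keeps the response text exactly as it was sent. The class name StreamToString and the method readAll are hypothetical names, not part of the original answer, and the sample input in main stands in for conn.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamToString {

    // Hypothetical helper: reads the whole stream as UTF-8 text and closes it.
    // Line terminators are preserved because characters are copied verbatim.
    public static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            // Wrapped so that callers in this sketch need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for conn.getInputStream(); the HTML is made-up sample data.
        byte[] sample = "<html>\n<body>page 2</body>\n</html>\n".getBytes(StandardCharsets.UTF_8);
        String html = readAll(new ByteArrayInputStream(sample));
        System.out.print(html);
    }
}
```

With the connection from the question, the call would be readAll(conn.getInputStream()).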