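I am trying to retrieve the HTML of page 2 of this URL. I added the required POST form data values, such as __EVENTTARGET and __EVENTARGUMENT, but the request still returns only the first page. Any clue as to what I might be missing?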
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
public class HttpURLConnectionExample {
    public static void main(String[] args) throws Exception {
        URL url = new URL(
                "http://www.themetaldirectory.com/?featured=0&country=USA");
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("__EVENTTARGET", "ctl00%24ctl00%24ContentPlaceHolderDefault%24DirectoryDisplay_12%24GridView1");
        params.put("__EVENTARGUMENT", "Page$2");
        StringBuilder postData = new StringBuilder();
        for (Map.Entry<String, Object> param : params.entrySet()) {
            if (postData.length() != 0)
                postData.append('&');
            postData.append(URLEncoder.encode(param.getKey(), "UTF-8"));
            postData.append('=');
            postData.append(URLEncoder.encode(String.valueOf(param.getValue()),
                    "UTF-8"));
        }
        byte[] postDataBytes = postData.toString().getBytes("UTF-8");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");
        conn.setRequestProperty("Content-Length",
                String.valueOf(postDataBytes.length));
        conn.setDoOutput(true);
        conn.getOutputStream().write(postDataBytes);
        Reader in = new BufferedReader(new InputStreamReader(
                conn.getInputStream(), "UTF-8"));
        for (int c; (c = in.read()) >= 0; System.out.print((char) c))
            ;
    }
}
Answer 0 (score: 2)
The __EVENTTARGET parameter value you are sending is incorrect.
The expected value is
ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1
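But you are sending the value URL-encoded twice. The first encoding is already baked into the literal you set as the parameter value:
params.put("__EVENTTARGET",
"ctl00%24ctl00%24ContentPlaceHolderDefault%24DirectoryDisplay_12%24GridView1");
The second happens when the parameter value is encoded while the POST body is built:
postData.append(URLEncoder.encode(String.valueOf(param.getValue()), "UTF-8"));
So what actually goes over the wire is a double URL-encoded value, in which each % from the first pass has itself been encoded as %25: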
ctl00%2524ctl00%2524ContentPlaceHolderDefault%2524DirectoryDisplay_12%2524GridView1
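The effect can be reproduced in isolation, without touching the network. The following is a minimal sketch rather than the original poster's code: the class and method names (`EncodingSketch`, `encode`, `buildFormBody`) are illustrative assumptions. It encodes the raw control name once and twice, then builds a form body from raw, unencoded values so that each value is encoded exactly one time.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingSketch {
    // Applies URL encoding exactly once (hypothetical helper for illustration).
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    // Builds an application/x-www-form-urlencoded body from RAW keys and values.
    static String buildFormBody(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(entry.getKey()))
                .append('=')
                .append(encode(entry.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String raw = "ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1";
        String once = encode(raw);   // '$' becomes %24
        String twice = encode(once); // '%' becomes %25, giving %2524
        System.out.println(once);
        System.out.println(twice);

        // Raw values go into the map; encoding happens only in buildFormBody.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("__EVENTTARGET", raw);
        params.put("__EVENTARGUMENT", "Page$2");
        System.out.println(buildFormBody(params));
    }
}
```

Keeping raw values in the map and encoding in a single place makes it hard to encode anything twice by accident.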
Set the parameter value without URL-encoding it first, and it should work fine:
params.put("__EVENTTARGET",
"ctl00$ctl00$ContentPlaceHolderDefault$DirectoryDisplay_12$GridView1");
Edit
If you want to save the resulting InputStream as a String, there are several ways to do it. One of them is as follows:
// Collect the response body into a single String.
StringBuilder inputStringBuilder = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
String line;
// readLine() strips the line terminator, so the lines are joined without line breaks.
while ((line = br.readLine()) != null) {
    inputStringBuilder.append(line);
}
String htmlString = inputStringBuilder.toString();
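Because readLine() strips line terminators, the loop above joins all lines into one, and it never closes the reader. If either of those matters, a variant along the following lines may help. It is only a sketch under stated assumptions: `StreamToStringSketch` and `readAll` are hypothetical names, and a ByteArrayInputStream stands in for conn.getInputStream() so the example runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamToStringSketch {
    // Reads the whole stream as UTF-8 text, keeping line breaks as they are.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream) when done.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for conn.getInputStream() (assumption for a runnable demo).
        InputStream in = new ByteArrayInputStream(
                "<html>\n<body>page 2</body>\n</html>".getBytes(StandardCharsets.UTF_8));
        String htmlString = readAll(in);
        System.out.println(htmlString);
    }
}
```

With a real connection, the call would be readAll(conn.getInputStream()).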