Unexpected results from a page

Date: 2012-04-07 19:38:13

Tags: java html-parsing

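I am trying to scrape data from an HTML table, but when I connect to the website from my code, the response does not contain what the browser shows.

This is what I expect to get, based on the HTML I see in the browser: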

<div id="ResultsContainer">
    <div id="Pagination"><div class="left">displaying: 601 - 633 of 633</div><div class="right">
... 

This is what I actually get:

 <div id=ResultsContainer>
        <p class=RedBold10pt>Search returned no matches</p>
 </div>

Here is my Java code:

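import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

// Build a GET request that identifies itself as a desktop Chrome browser.
HttpClient client = new DefaultHttpClient();

HttpGet request = new HttpGet();
request.setHeader("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.15 Safari/536.5");
request.setURI(new URI("http://results.active.com/pages/searchform.jsp?posted_p=t&numPerPage=50&page=0&rsID=10505&queryType=division#VIEW"));
HttpResponse response = client.execute(request);

// Read the response body line by line into a single string.
BufferedReader in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
StringBuilder sb = new StringBuilder();
String line;
String NL = System.getProperty("line.separator");

while ((line = in.readLine()) != null) {
    sb.append(line).append(NL);
}
in.close();
String page = sb.toString();
System.out.println(page);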

What could be causing this?

1 Answer:

Answer 0: (score: 0)

The problem was with Google App Engine. I had to downgrade from 1.6.4 to 1.6.3.
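When comparing behaviour across two environments (for example, before and after a downgrade like the one above), it can help to check the fetched page automatically instead of reading it by eye. The following is only a minimal sketch: the class name ResultCheck, the Outcome enum, and the classify method are hypothetical and not part of the original post, and it assumes the two markers shown in the question ("Search returned no matches" and the Pagination block) are reliable indicators for this site.

```java
// Hypothetical helper, not from the original post: classifies a fetched page
// using the two markers that appear in the question.
class ResultCheck {

    enum Outcome { RESULTS, NO_MATCHES, UNKNOWN }

    static Outcome classify(String html) {
        if (html == null) {
            return Outcome.UNKNOWN;
        }
        // Marker from the unexpected response shown in the question.
        if (html.contains("Search returned no matches")) {
            return Outcome.NO_MATCHES;
        }
        // Marker from the expected response: the pagination block.
        // The attribute is checked both quoted and unquoted, because the
        // two responses in the question differ in how attributes are quoted.
        if (html.contains("id=\"Pagination\"") || html.contains("id=Pagination")) {
            return Outcome.RESULTS;
        }
        return Outcome.UNKNOWN;
    }

    public static void main(String[] args) {
        String expected = "<div id=\"ResultsContainer\">\n"
                + "    <div id=\"Pagination\"><div class=\"left\">displaying: 601 - 633 of 633</div>";
        String actual = "<div id=ResultsContainer>\n"
                + "    <p class=RedBold10pt>Search returned no matches</p>\n"
                + "</div>";

        System.out.println(classify(expected)); // prints RESULTS
        System.out.println(classify(actual));   // prints NO_MATCHES
    }
}
```

In the code from the question, the page string could be passed to ResultCheck.classify(page) right after it is built, so the same request can be run in each environment and the outcomes compared directly.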