I just want to fetch the source of a page so that I can parse data out of it. I used the following code:
public static String getPageSourceFromUrl(String Url) {
    String text = "";
    try {
        HttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet(Url);
        HttpResponse response = client.execute(request);
        InputStream in = response.getEntity().getContent();
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in));
        StringBuilder str = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            str.append(line);
        }
        in.close();
        text = str.toString();
    } catch (Exception ex) {
    }
    return text;
}
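The page is: "http://vn.answers.yahoo.com/question/index?qid=20111022210730AAWvfKI"
Unfortunately, for some pages (such as the one above), the returned text is only part of the full source. Perhaps the string exceeds some limit. Does anyone have a solution?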
Answer 0 (score: 0)
I managed to solve (our) problem. An elegant way to do it is to create a class:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

public class GetRating {

    public GetRating() {
    }

    // Downloads the page at the given URL and returns its source.
    // If the request or the read fails, returns whatever was read so far.
    public String GetRatingFromURL(String url) {
        HttpClient httpclient = new DefaultHttpClient(); // Create HTTP client
        HttpGet httpget = new HttpGet(url); // Set the action you want to do
        StringBuilder sb = new StringBuilder();
        BufferedReader reader = null;
        try {
            HttpResponse response = httpclient.execute(httpget); // Execute it
            HttpEntity entity = response.getEntity();
            if (entity == null) {
                return ""; // The response carried no body
            }
            InputStream is = entity.getContent(); // InputStream with the response
            // The charset should match the page; iso-8859-1 is what worked here
            reader = new BufferedReader(new InputStreamReader(is, "iso-8859-1"), 8);
            String line;
            while ((line = reader.readLine()) != null) { // Read line by line
                sb.append(line).append("\n");
            }
        } catch (IOException e) {
            // Also covers ClientProtocolException and UnsupportedEncodingException
            e.printStackTrace();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if closing fails
                }
            }
        }
        return sb.toString(); // Result is here
    }
}
Then call it from another class:
GetRating GN = new GetRating();
String pagesource = GN.GetRatingFromURL("http://somewebpage.com");
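If the line-by-line loop is not a requirement, the reading step can also be written as a small helper that copies the stream in chunks and decodes it with an explicit charset. This is only a sketch: `PageSourceReader` and `readAll` are names made up for illustration, UTF-8 is an assumption about the page, and `StandardCharsets` needs Java 7 or later. It uses only `java.io`, so it can be tried without any HTTP client.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the class above.
public class PageSourceReader {

    // Reads the stream to the end and decodes it with the given charset.
    // Line breaks are kept exactly as they appear in the input.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        Reader reader = new InputStreamReader(in, charset);
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n); // Append only the chars actually read
            }
        } finally {
            reader.close(); // Also closes the underlying stream
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; UTF-8 is assumed here
        byte[] body = "<html>\n<body>Xin ch\u00e0o</body>\n</html>"
                .getBytes(StandardCharsets.UTF_8);
        String source = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);
        System.out.println(source.length() + " characters read");
    }
}
```

Inside `GetRatingFromURL` the same helper could presumably be fed `entity.getContent()` in place of the `BufferedReader` loop. Printing the length of the result, rather than the whole string, is a quick way to check whether the full page really arrived.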