I want to scrape a web page. I wrote this code to get the value of the score div:
public String GetRatingAndVotesFromURL(String url){
    ByteArrayOutputStream outputDoc = null;
    String page = "";
    String rating = "", votes = "";
    String rating_helper="", votes_helper="";
    try
    {
        InputStream is = new URL(url).openStream();
        outputDoc = new ByteArrayOutputStream();
        byte buf[]=new byte[1024];
        int len;
        while((len=is.read(buf))>0)
        {
            outputDoc.write(buf,0, len);
        }
        page = new String(outputDoc.toByteArray(), "UTF-8");
        int start = page.indexOf("<div class=\"score\">")+18; //77956
        int finish = start+5; //77966
        rating = page.substring(start+1, finish).toString();
        for (int i=0;i<5;i++){
            if ( String.valueOf(rating.charAt(i)).equals("<")) break;
            rating_helper += String.valueOf(rating.charAt(i));
        }
    }
    catch(Exception e) { e.printStackTrace(); }
    return rating_helper;
}
This works, but it is an awkward way to find that part of the page.
So I changed
    int start = page.indexOf("<div class=\"score\">")+18; //77956
    int finish = start+5; //77966
    rating = page.substring(start+1, finish).toString();
    for (int i=0;i<5;i++){
        if ( String.valueOf(rating.charAt(i)).equals("<")) break;
        rating_helper += String.valueOf(rating.charAt(i));
    }
to
    Pattern p = Pattern.compile("<div class=\"score\">([0-9,]+)</div>");
    Matcher m = p.matcher(page);
    if(m.matches()) {
        rating_helper = m.group(1);
    }
    else rating_helper = "notfound";
But this does not work: I always get "notfound". What am I doing wrong?
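One likely cause, offered as a sketch rather than a confirmed diagnosis: Matcher.matches() succeeds only if the entire input string matches the pattern, while Matcher.find() looks for the pattern anywhere in the input. Since page holds the whole HTML document, matches() would be expected to fail here. The class name ScoreExtractor, the method extractScore, and the sample page string below are hypothetical, made up for illustration. The pattern is the one from the question and assumes the score consists only of digits and commas with no surrounding whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScoreExtractor {
    // Same pattern as in the question: digits and commas inside the score div.
    private static final Pattern SCORE =
            Pattern.compile("<div class=\"score\">([0-9,]+)</div>");

    // Returns the captured score, or "notfound" if no such div is present.
    static String extractScore(String page) {
        Matcher m = SCORE.matcher(page);
        // find() scans for the pattern anywhere in the input;
        // matches() would require the entire page to match it.
        return m.find() ? m.group(1) : "notfound";
    }

    public static void main(String[] args) {
        // Hypothetical page fragment, for illustration only.
        String page = "<html><body><div class=\"score\">8,5</div></body></html>";
        System.out.println(extractScore(page));            // prints 8,5
        System.out.println(SCORE.matcher(page).matches()); // prints false
    }
}
```

If find() still returns "notfound" on the real page, the markup probably differs from the pattern, for example a decimal point instead of a comma, extra attributes on the div, or whitespace around the number.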