I'm relatively new to jsoup, and I can't seem to find the right query to parse the value I'm looking for. The HTML is below.
<img src='http://rootzwiki.com/public/style_images/ginger/t_unread.png' alt='New Replies' /><br />
</a>
</td>
<td class='col_f_content '>
<h4><a id="tid-link-12251" href="http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/" title='View topic, started 17 December 2011 - 09:32 AM' class='topic_title'>[ROM][LTE] RootzBoat 4.0.3 V6.1</a></h4>
<br />
<span class='desc lighter blend_links'>
Started by <a hovercard-ref="member" hovercard-id="5" class="_hovertrigger url fn " href='http://rootzwiki.com/user/5-birdman/'>birdman</a>, 17 Dec 2011
</span>
<ul class='mini_pagination'>
<li><a href="http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/" title='Go to page 1'>1</a></li>
<li><a href="http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/page__st__10" title='Go to page 2'>2</a></li>
<li><a href="http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/page__st__20" title='Go to page 3'>3</a></li>
<li><a href="http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/page__st__1990" title='Go to page 200'>200 →</a></li>
</ul>
</td>
<td class='col_f_preview __topic_preview'>
<a href='http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/' class='expander closed' title='Preview this topic'> </a>
</td>
<td class='col_f_views desc blend_links'>
<ul>
<li>
<span class='ipsBadge ipsBadge_orange'>Hot</span>
<a href="http://rootzwiki.com/index.php?app=forums&module=extras&section=stats&do=who&t=12251" onclick="return ipb.forums.retrieveWhoPosted( 12251 );">1,999 replies</a>
</li>
<li class='views desc'>180,213 views</li>
</ul>
</td>
<td class='col_f_post'>
<a href='http://rootzwiki.com/user/49940-jakeday/' class='ipsUserPhotoLink left'>
<img src='http://rootzwiki.com/uploads/profile/photo-thumb-49940.jpg' class='ipsUserPhoto ipsUserPhoto_mini' />
</a>
<ul class='last_post ipsType_small'>
<li><a hovercard-ref="member" hovercard-id="49940" class="_hovertrigger url fn " href='http://rootzwiki.com/user/49940-jakeday/'>jakeday</a></li>
<li>
<a href='http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/page__view__getlastpost' title='Go to last post'>Today, 04:20 AM</a>
</li>
</ul>
</td>
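I need to parse "birdman" out of that. I know that once I have the element defined, I can get "birdman" with author.text(), but I can't figure out how to define the author element. I thought the code block below might work, but as I mentioned, I'm new to jsoup and HTML, and it clearly doesn't. There is nothing wrong with the connection; jsoup is parsing the other values I extract.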
TitleResults titleArray = new TitleResults();
Document doc = null;
try {
doc = Jsoup.connect(Constants.FORUM).get();
} catch (IOException e) {
e.printStackTrace();
}
Elements threads = doc.select(".topic_title");
for (Element thread : threads) {
titleArray = new TitleResults();
//Thread title
threadTitle = thread.text();
titleArray.setItemName(threadTitle);
//Thread link
String threadStr = thread.attr("abs:href");
String endTag = "/page__view__getnewpost"; //trim link
threadStr = threadStr.replace(endTag, "");
threadArray.add(threadStr);
titleArray.setAuthorDate("Author/Date");
results.add(titleArray);
}
Elements authors = doc.select("a[hovercard-ref]");
for (Element author : authors) {
if (author.attr("abs:href").contains("/user/")){
Log.d("POC", "SUCCESS " + author.attr("abs:href"));
} else {
Log.d("POC", "FAILURE " + author.text());
}
}
}
Answer 0 (score: 0)
I think you're overthinking this ;)
To get the "birdman" part of the link, just use the following:
Elements authors = doc.select("a");
for (Element author : authors) {
Log.d("POC", author.text());
}
The "a" selector retrieves all links in the document. After that, you can use .text(), as you said, to retrieve the value.
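Since "a" matches every link on the page, a narrower selector may be more useful if only the thread starter is wanted. The following is a minimal sketch, not a tested solution against the live forum: it assumes the markup looks like the fragment in the question, and the class name ThreadAuthors, the method titlesWithAuthors and the constant SAMPLE_HTML are made up for illustration. The sample is wrapped in a table and tr so that the td cells survive HTML parsing.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

class ThreadAuthors {

    // Trimmed copy of the markup from the question (hypothetical test fixture).
    static final String SAMPLE_HTML = "<table><tr>"
        + "<td class='col_f_content '>"
        + "<h4><a href='http://rootzwiki.com/topic/12251-romlte-rootzboat-403-v61/' "
        + "class='topic_title'>[ROM][LTE] RootzBoat 4.0.3 V6.1</a></h4>"
        + "<span class='desc lighter blend_links'>Started by "
        + "<a hovercard-ref=\"member\" hovercard-id=\"5\" class=\"_hovertrigger url fn \" "
        + "href='http://rootzwiki.com/user/5-birdman/'>birdman</a>, 17 Dec 2011</span>"
        + "</td>"
        + "<td class='col_f_post'><ul class='last_post ipsType_small'><li>"
        + "<a hovercard-ref=\"member\" hovercard-id=\"49940\" class=\"_hovertrigger url fn \" "
        + "href='http://rootzwiki.com/user/49940-jakeday/'>jakeday</a>"
        + "</li></ul></td>"
        + "</tr></table>";

    // Builds one "title by author" string per topic cell.
    static List<String> titlesWithAuthors(Document doc) {
        List<String> out = new ArrayList<>();
        // Each topic cell holds both the title link and the "Started by" span.
        for (Element cell : doc.select("td.col_f_content")) {
            Element title = cell.select("a.topic_title").first();
            // Scope the member link to the description span inside this cell,
            // so the last poster in the neighbouring cell is not picked up.
            Element author = cell.select("span.desc a[hovercard-ref=member]").first();
            if (title == null || author == null) {
                continue; // skip cells that do not match the assumed layout
            }
            out.add(title.text() + " by " + author.text());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML, "http://rootzwiki.com/");
        for (String line : titlesWithAuthors(doc)) {
            System.out.println(line);
        }
    }
}
```

Selecting inside each td.col_f_content cell also keeps the title and the author paired, which a separate loop over all a[hovercard-ref] links does not. On the sample fragment only birdman is matched; jakeday sits in the col_f_post cell and is left out.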
Answer 1 (score: 0)