I am fairly new to JSoup. I am trying to parse HTML scraped from a website that looks like this:
.....
<FONT COLOR=#2D8F26 FACE="Arial"><B>Claim:</B></FONT> Photograph shows a Chicago Bears fan holding a crude sign at the <NOBR>2006-07</NOBR> <NOBR>NFC championship</NOBR> game.
<BR><BR>
<NOINDEX>
<FONT COLOR=#2D8F26 FACE="Arial"><B>Status:</B></FONT> <FONT COLOR=#FF0000 FACE="Arial"><B><I>True.</I></B></FONT>
</NOINDEX>
<BR><BR>
<FONT COLOR=#2D8F26 FACE="Arial"><B>Example:</B></FONT> <FONT COLOR=#2D8F26 FACE="Trebuchet MS,Bookman Old Style,Arial"><I>[Collected via e-mail, January 2007]</I></FONT>
<BR><BR>
<TABLE WIDTH=400 ALIGN=CENTER BORDER=0 BGCOLOR=#000000><TR><TD BGCOLOR=#EAF2E5>
<FONT FACE="Verdana" SIZE=2">
<DIV STYLE="text-align: justify; margin-top: 10px; margin-bottom: 10px; margin-left: 15px; margin-right: 15px">
The attached photo has been circulating around the Gulf Coast region for a couple of days now (since Saturday's Bears-Saints game). Do you have any word on whether it is authentic or doctored? Was this individual really that tasteless and crude?
<BR><BR>
<CENTER>
......
I would like to produce output in the following form:
Claim :Photograph shows a Chicago Bears fan holding a crude sign at the 2006-07 NFC championship game.
Status:True.
Example:The attached photo has been circulating around the Gulf Coast region for a couple of days now (since Saturday's Bears-Saints game). Do you have any word on whether it is authentic or doctored? Was this individual really that tasteless and crude?
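The JSoup documentation shows how to retrieve information based on tags, but how do I get the output above using JSoup? Any samples, or alternatives to JSoup, would be appreciated.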
Answer 0 (score: 3)
I think you just want the text content with the HTML markup stripped out. This should work:
String text = Jsoup.parse(yourInputString).text(); // yourInputString holds the scraped HTML
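As far as I know, text() normalizes whitespace and returns everything on a single line, so the three labels would still run together. Below is a minimal sketch of one possible follow-up step that uses only the standard library to break that flattened string before each label. The class name LabelSplitter, the method splitByLabels, and the sample string are hypothetical and only stand in for whatever text() returns for your page. The bracketed "[Collected via e-mail, January 2007]" note would still be part of the Example line and would need to be removed separately if you do not want it.

```java
import java.util.Arrays;
import java.util.List;

class LabelSplitter {
    // Hypothetical helper: splits flattened text before each known label.
    // The lookahead keeps the label itself at the start of each piece.
    static List<String> splitByLabels(String flatText) {
        String[] parts = flatText.trim().split("\\s+(?=(?:Claim|Status|Example):)");
        return Arrays.asList(parts);
    }

    public static void main(String[] args) {
        // Stand-in for the result of Jsoup.parse(html).text(); an assumption,
        // not the exact string JSoup would produce for the page in question.
        String flat = "Claim: Photograph shows a Chicago Bears fan holding a crude sign. "
                + "Status: True. "
                + "Example: The attached photo has been circulating around the Gulf Coast region.";
        for (String line : splitByLabels(flat)) {
            System.out.println(line);
        }
    }
}
```

This only works when the label words followed by a colon do not also occur inside the body text; if they can, selecting the individual elements with JSoup before calling text() is likely the safer route.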