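I need to extract text from the source of a web page, but the HTML is quite ugly: it consists mostly of tr and td elements.
For example, from the second tr element I want the text '80 Sand Bass, 10 Barracuda, 1 Bonito, 15 Rockfish'. From the third tr element I want the text '55 Calico Bass, 30 Barracuda, 8 Bonito, 15 Rockfish'.
What is the best way to extract specific parts of the source in a case like this? Because the elements have no id attributes, I can't use stringByEvaluatingJavaScriptFromString.
The following code returns a string with the complete contents of the page, but how do I search it for the rows I want?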
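NSString *hmlanding = @"http://www.hmlanding.com/fcount/fcount.htm";
// Build the URL from the string; NSURL has no class method named hmlanding.
NSURL *hmlandingURL = [NSURL URLWithString:hmlanding];
NSError *error = nil;
// Download the whole page into one string.
NSString *hmlandingPage = [NSString stringWithContentsOfURL:hmlandingURL
                                                   encoding:NSASCIIStringEncoding
                                                      error:&error];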
// Part of the ugly HTML source
<tr>
<td align="center" valign="top" bgcolor="#CCCCFF" height="38" width="202">
<b><A HREF="fcount_malihini.htm" target="_blank">Malihini<font size="2">
</font>
</A><font size="2"> (Mexico freelance)</font><BR>
<A HREF="http://www.malihinisportfishing.com/report.html" target="_blank">
<FONT SIZE="-1"><I>On the Water Report</I></FONT> </A> </b></td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="38" width="61">
44</td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="38" width="435">
121 Yellowtail, 1 Dorado, 3 Yellowfin Tuna</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="202">
<strong><A HREF="fcount_premier.htm" target="_blank">Premier<span style="text-decoration: none">
AM<font size="2" color="#000000">
</font></span></A></strong></td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="61">
59</td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="435">
80 Sand Bass, 10 Barracuda, 1 Bonito, 15 Rockfish</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="202">
<strong>
<u>
<font color="#0000CC"><A HREF="fcount_premier.htm" target="_blank">
Premier</a></font><font color="#0000FF"><span style="text-decoration: none">
PM</span></font></u></strong></td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="61">
29</td>
<td align="center" valign="top" bgcolor="#CCCCFF" height="19" width="435">
55 Calico Bass, 30 Barracuda, 8 Bonito, 15 Rockfish</td>
</tr>
<tr>
<td bgcolor="#CCCCFF">
<p align="center"><b><font color="#0000CC"><u>Premier Twilight </u>
<font size="2"> </font></font></b></td>
<td bgcolor="#CCCCFF">
<p align="center"> </td>
<td align="center" bgcolor="#CCCCFF" height="20" width="435"> </td>
</tr>
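
One possible way to search the downloaded string for those rows is sketched below with NSRegularExpression. It is only a sketch under stated assumptions: every row of interest is a tr with at least three td cells, the catch text always sits in the third cell, and td elements are never nested. The function name CatchTextsFromHTML is hypothetical and not part of the original code. Regular expressions are fragile against malformed markup, so a real HTML parser (for example the HTML parser in libxml2) would likely be more robust.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the trimmed text
// of the third <td> of every <tr> in the given HTML string.
// Rows with fewer than three cells, or with an empty third cell, are skipped.
// HTML entities such as &nbsp; are NOT decoded here.
static NSArray<NSString *> *CatchTextsFromHTML(NSString *html) {
    NSMutableArray<NSString *> *results = [NSMutableArray array];
    NSError *error = nil;

    // Case-insensitive, and "." also matches newlines so a row can span lines.
    NSRegularExpressionOptions opts = NSRegularExpressionCaseInsensitive |
                                      NSRegularExpressionDotMatchesLineSeparators;
    NSRegularExpression *rowRegex =
        [NSRegularExpression regularExpressionWithPattern:@"<tr\\b[^>]*>(.*?)</tr>"
                                                  options:opts
                                                    error:&error];
    NSRegularExpression *cellRegex =
        [NSRegularExpression regularExpressionWithPattern:@"<td\\b[^>]*>(.*?)</td>"
                                                  options:opts
                                                    error:&error];
    // Matches any remaining tag inside a cell so it can be stripped.
    NSRegularExpression *tagRegex =
        [NSRegularExpression regularExpressionWithPattern:@"<[^>]+>"
                                                  options:0
                                                    error:&error];
    if (!rowRegex || !cellRegex || !tagRegex) {
        return results;
    }

    NSArray<NSTextCheckingResult *> *rows =
        [rowRegex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *row in rows) {
        NSString *rowHTML = [html substringWithRange:[row rangeAtIndex:1]];
        NSArray<NSTextCheckingResult *> *cells =
            [cellRegex matchesInString:rowHTML
                               options:0
                                 range:NSMakeRange(0, rowHTML.length)];
        if (cells.count < 3) {
            continue;
        }

        // Take the inner HTML of the third cell and remove any nested tags.
        NSString *cellHTML = [rowHTML substringWithRange:[cells[2] rangeAtIndex:1]];
        NSString *text =
            [tagRegex stringByReplacingMatchesInString:cellHTML
                                               options:0
                                                 range:NSMakeRange(0, cellHTML.length)
                                          withTemplate:@""];
        text = [text stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        if (text.length > 0) {
            [results addObject:text];
        }
    }
    return results;
}
```

Calling CatchTextsFromHTML(hmlandingPage) on the string downloaded above would then give an array with one entry per non-empty row, which can be indexed or filtered by boat name as needed.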