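iOS Hpple HTML parsing

Time: 2015-07-08 23:51:07

Tags: html ios objective-c uitableview hpple

I need to parse the content of a website into a table view in my app. I tried hpple, and it works in a few test cases, but I can't get it to work for my specific case. The HTML: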



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html>
   <head>
      <link rel="stylesheet" type="text/css" href="willi.css">
      </link><script src="style.js" type="text/javascript"></script>
      <title>Homepage</title>
   </head>
   <body>
      <a name="oben"/>
         <h1>Date</h1>
         <br />
      <a href="#07.07.2015">07.07.2015</a><br />
      <a href="#07.08.2015">07.08.2015</a><br />
      <a name="07.07.2015">
         <hr />
      </a>
      <p class="page" style="text-align:left">
      <h2>Date Tue, 7.7.2015</h2>
      created: 7.7. 16:35 </p>
      <p class="page" style="text-align:left">
      <table class="F" border-width="3">
         <colgroup>
            <col width="899"/>
         </colgroup>
         <tr class="F">
            <th rowspan="1" class="F">
               ***&nbsp;&nbsp; Version 1&nbsp;&nbsp; ***
            </th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F"></th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F">
               Testmessage 1
            </th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F">
               Testmessage 2
            </th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F">
               Testmessage 3
            </th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F"></th>
         </tr>
         <tr class="F">
            <th rowspan="1" class="F">
               Testmessage 4
            </th>
         </tr>
      </table>
      </p>
      <p class="seite" style="text-align:left">
      <h4>List:</h4>
      <table class="k" border-width="3">
         <tr>
            <th width="50">
               Team
            </th>
            <th width="50">
               &nbsp;Name
            </th>
            <th width="50">
               Nr.
            </th>
            <th width="50">
               &nbsp;Mate
            </th>
            <th width="50">
               Spot
            </th>
            <th width="50">
               &nbsp;Map
            </th>
            <th width="150"></th>
         </tr>
         <tr class="k">
            <th rowspan="5" class="k">
               A
            </th>
            <td>
               &nbsp;First
            </td>
            <td>
               3
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Second
            </td>
            <td>
               4
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Sie
            </td>
            <td>
               8
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Sie
            </td>
            <td>
               9
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Es
            </td>
            <td>
               10
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr class="k">
            <th rowspan="1" class="k">
               B
            </th>
            <td>
               &nbsp;Red
            </td>
            <td>
               11
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
      </table>
      </p>
      <hr />
      <a name="07.08.2015">
         <hr />
      </a>
      <p class="page" style="text-align:left">
      <h2>Date Thu, 8.7.2015</h2>
      created: 7.7. 16:35 </p>
      <p class="page" style="text-align:left">
      <table class="F" border-width="3">
         <colgroup>
            <col width="899"/>
         </colgroup>
         <tr class="F">
            <th rowspan="1" class="F">
               ***&nbsp;&nbsp; Version 1&nbsp;&nbsp; ***
            </th>
         </tr>
      </table>
      </p>
      <p class="page" style="text-align:left">
      <h4>List:</h4>
      <table class="k" border-width="3">
         <tr>
            <th width="50">
               Team
            </th>
            <th width="50">
               &nbsp;Name
            </th>
            <th width="50">
               Nr.
            </th>
            <th width="50">
               &nbsp;Mate
            </th>
            <th width="50">
               Spot
            </th>
            <th width="50">
               &nbsp;Map
            </th>
            <th width="150"></th>
         </tr>
         <tr class="k">
            <th rowspan="5" class="k">
               C
            </th>
            <td>
               &nbsp;Dnk
            </td>
            <td>
               1
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Es
            </td>
            <td>
               1
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Dnk
            </td>
            <td>
               2
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;Esta
            </td>
            <td>
               2
            </td>
            <td>
               &nbsp;
            </td>
            <td></td>
            <td>
               &nbsp;
            </td>
            <td>
               &nbsp;Test
            </td>
         </tr>
         <tr>
            <td>
               &nbsp;SWB
            </td>
            <td>
               6
            </td>
            <td>
               &nbsp;Naau
            </td>
            <td>
               F
            </td>
            <td>
               &nbsp;Test
            </td>
            <td>
               &nbsp;
            </td>
         </tr>
      </table>
      </p>
      <hr />
   </body>
</html>

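The page contains two main elements (<table></table>) holding the content I want to fill the UITableView with.

My goal is one section per table, with each section containing all of that table's content. The section header title should be the "Date".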

TFHpple *Parser = [TFHpple hppleWithHTMLData:HtmlData];

NSString *XpathQueryString = @"/html/body/a";
NSArray *Nodes = [Parser searchWithXPathQuery:XpathQueryString];

for (TFHppleElement *element in Nodes) {
    NSString *temp = [[element firstChild] content];
    if (temp.length == 10) {
        [Day addObject:temp];
    }
}

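I store the dates in my NSMutableArray *Day, and that works fine: I get 2 sections with the correct names. But when I try to read the table contents, I can't get it to work. I would like something like this:
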
tableElement* newElement = [[tableElement alloc] init];
newElement.day = @"07.07.2015";
newElement.team = @"A";
newElement.name = @"First";
newElement.nr = @"3";
newElement.mate = @"";
newElement.spot = @"";
newElement.map = @"";
newElement.status = @"Test";
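
The tableElement class itself is not shown here. A minimal sketch of what it might look like, assuming every field is a plain string and the property names match the assignments above (the declaration is hypothetical, not the actual class):

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical model for one table row; the real class is not shown in the question.
@interface tableElement : NSObject
@property (nonatomic, copy) NSString *day;    // date of the section, e.g. @"07.07.2015"
@property (nonatomic, copy) NSString *team;   // value of the <th rowspan="..."> cell
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *nr;
@property (nonatomic, copy) NSString *mate;
@property (nonatomic, copy) NSString *spot;
@property (nonatomic, copy) NSString *map;
@property (nonatomic, copy) NSString *status; // last, unlabeled column
@end

@implementation tableElement
@end
```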

Then I could store all the newElement objects for day one in one array, and all the elements for day two in another.

  

Edit: newElement.day = @"07.07.2015"; is only an example, of course. It would need to be something like newElement.day = [[hppleparse firstChild] content];

1 answer:

Answer 0 (score: 1)

This can be done easily with HTMLKit.

Here are a few examples of what you can do with the HTML you provided:

HTMLDocument *document = [HTMLDocument documentWithString:html];
NSMutableArray *days = [NSMutableArray array];
NSArray *links = [document querySelectorAll:@"a"];
for (HTMLElement *link in links) {
  if (link.textContent.length == 10) {
    [days addObject:link.textContent];
  }
}

// For example you can:
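// Get all <tr> elements inside the table with className 'k'.
// A spec-compliant parser wraps the rows in an implied <tbody>,
// so the child selector 'table.k > tr' would match nothing here.
NSArray *tableKRows = [document querySelectorAll:@"table.k tr"];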

// Get all <td> elements that are descendants of the table with className 'k'
NSArray *tableKData = [document querySelectorAll:@"table.k td"];

// Collect content of all <td> elements in `array`
NSMutableArray *array = [NSMutableArray array];
for (HTMLElement *td in tableKData) {
  NSString *content = [td.textContent stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
  [array addObject:content];
}
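
The flat `array` above loses the row structure. Below is a rough sketch of how the cells could be grouped into one record per row and one list per table instead. It assumes that `querySelector:` and `querySelectorAll:` can also be called on an individual element, not just on the document. The helper name `ParseTeamTables` and the dictionary keys are made up for illustration and follow the field names in the question. The team cell uses `rowspan`, so it appears only in the first row of each group and is carried forward to the rows after it.

```objective-c
#import <Foundation/Foundation.h>
#import <HTMLKit/HTMLKit.h>

// Hypothetical helper: returns one array of row dictionaries per table.k,
// in document order, so index 0 pairs with the first date in `days`.
static NSArray<NSArray<NSDictionary *> *> *ParseTeamTables(NSString *html) {
    HTMLDocument *document = [HTMLDocument documentWithString:html];
    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    // Column order of the <td> cells; key names are assumptions taken from the question.
    NSArray<NSString *> *keys = @[@"name", @"nr", @"mate", @"spot", @"map", @"status"];
    NSMutableArray *sections = [NSMutableArray array];

    for (HTMLElement *table in [document querySelectorAll:@"table.k"]) {
        NSMutableArray *rows = [NSMutableArray array];
        NSString *team = @"";
        // Assumption: selector queries also work when rooted at an element.
        for (HTMLElement *row in [table querySelectorAll:@"tr"]) {
            NSArray<HTMLElement *> *cells = [row querySelectorAll:@"td"];
            if (cells.count < keys.count) {
                continue; // the header row contains only <th> cells
            }
            // The team <th> is present only in the first row of each rowspan group.
            HTMLElement *teamCell = [row querySelector:@"th"];
            if (teamCell != nil) {
                team = [teamCell.textContent stringByTrimmingCharactersInSet:ws];
            }
            NSMutableDictionary *record =
                [NSMutableDictionary dictionaryWithObject:team forKey:@"team"];
            for (NSUInteger i = 0; i < keys.count; i++) {
                record[keys[i]] =
                    [cells[i].textContent stringByTrimmingCharactersInSet:ws];
            }
            [rows addObject:record];
        }
        [sections addObject:rows];
    }
    return sections;
}
```

Each inner array can then back one UITableView section, with the matching entry of `days` as the section title.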

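If you need any further help, let me know.

HTMLKit is a pure Objective-C HTML parser with CSS3 selector support. It is not a wrapper around libxml or any other library, but a complete implementation that conforms to the WHATWG HTML specification.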