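hpple parser: getting all content, including the tags

Time: 2013-01-29 21:07:28

Tags: hpple

I have a large chunk of HTML and I am trying to get everything inside a div, but I cannot retrieve it with [element content] or [element text].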

<div class="text_comment" id="xxx">
                    <blockquote><i><i><a>some text</a><br></i></i>
<blockquote>lorem ipsum ...</blockquote>
</blockquote>
<p>some text</p>
<p>lorem ipsum...</p>
<blockquote>another text</blockquote>
</blockquote>
<p>another text</p>             
</div>

I am trying to retrieve everything inside the div with its tags intact, for example:

   <blockquote><i><i><a>some text</a><br></i></i>
    <blockquote>lorem ipsum ...</blockquote>
    </blockquote>
    <p>some text</p>
    <p>lorem ipsum...</p>
    <blockquote>another text</blockquote>
    </blockquote>
    <p>another text</p>

Can anyone help me?

1 answer:

Answer 0: (score: 1)

Solved. In case anyone else needs this, it only takes a few small changes:

TFHppleElement.h

@property (nonatomic, copy, readonly) NSString *raw;

TFHppleElement.m

- (NSString *)raw
{
    return [node objectForKey:@"raw"];
}

XPathQuery.m

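NSDictionary *DictionaryForNode(xmlNodePtr currentNode, NSMutableDictionary *parentResult, BOOL parentContent)
{
    ...
    // Serialize the node as markup. The dump includes the node's own
    // opening and closing tags as well as everything inside them.
    xmlBufferPtr buffer = xmlBufferCreate();
    xmlNodeDump(buffer, currentNode->doc, currentNode, 0, 0);

    // Store the markup only if the dump produced valid UTF-8;
    // stringWithCString:encoding: must not be passed NULL.
    const xmlChar *dumped = xmlBufferContent(buffer);
    NSString *rawContent = dumped ? [NSString stringWithCString:(const char *)dumped encoding:NSUTF8StringEncoding] : nil;
    if (rawContent) {
        [resultForNode setObject:rawContent forKey:@"raw"];
    }

    xmlBufferFree(buffer);

    return resultForNode;
}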
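
One thing to be aware of: xmlNodeDump serializes the node itself, so the raw string for the div in the question begins with its own opening tag and ends with </div>. If only the inner markup is wanted, one option is to trim the outer tags afterwards. Below is a minimal sketch in plain C (it should also compile inside a .m file). `inner_markup` is a hypothetical helper name, not part of hpple or libxml2, and it assumes the opening tag contains no literal '>' inside an attribute value.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of hpple or libxml2.
 * Returns a newly malloc'd copy of the text between the first '>' and the
 * last '<' of a serialized element such as <div ...>...</div>, or NULL if
 * the input does not have that shape. The caller must free() the result. */
char *inner_markup(const char *raw)
{
    if (raw == NULL) {
        return NULL;
    }

    const char *open_end = strchr(raw, '>');      /* end of the opening tag */
    const char *close_start = strrchr(raw, '<');  /* start of the last tag */

    /* Reject input with no tags, or with a single tag such as <br>. */
    if (open_end == NULL || close_start == NULL || close_start <= open_end) {
        return NULL;
    }

    /* The last tag has to be a closing tag. */
    if (close_start[1] != '/') {
        return NULL;
    }

    size_t len = (size_t)(close_start - (open_end + 1));
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }

    memcpy(out, open_end + 1, len);
    out[len] = '\0';
    return out;
}
```

From Objective-C, the result could be wrapped with [NSString stringWithUTF8String:] and then released with free(). A sturdier alternative would be to dump each child node separately inside DictionaryForNode, but that takes a larger change to XPathQuery.m.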