Question

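I'm using Ben Reeves' HTML parser to parse text that contains some HTML markup. It represents each node as an HTMLNode object, which has a single ivar of type xmlNode * from libxml2. xmlNode is a struct that looks like this: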

struct _xmlNode {
    void            *_private;   /* application data */
    xmlElementType   type;       /* type number, must be second ! */
    const xmlChar   *name;       /* the name of the node, or the entity */
    struct _xmlNode *children;   /* parent->childs link */
    struct _xmlNode *last;       /* last child link */
    struct _xmlNode *parent;     /* child->parent link */
    struct _xmlNode *next;       /* next sibling link  */
    struct _xmlNode *prev;       /* previous sibling link  */
    struct _xmlDoc  *doc;        /* the containing document */

    /* End of common part */
    xmlNs           *ns;         /* pointer to the associated namespace */
    xmlChar         *content;    /* the content */
    struct _xmlAttr *properties; /* properties list */
    xmlNs           *nsDef;      /* namespace definitions on this node */
    void            *psvi;       /* for type/PSVI informations */
    unsigned short   line;       /* line number */
    unsigned short   extra;      /* extra data for XPath/XSLT */
};

I have a method that takes a string, wraps it in an HTMLNode, and returns that node:

- (HTMLNode*)nodeFromString:(NSString*)string 
{
    /* Creates parser which wraps string in <doc><html><body> tags */

    HTMLParser *parser = [[HTMLParser alloc] initWithString:string error:nil];

    /* Get contents of <body> tag and return it to parse later */

    HTMLNode *body = [parser body];    
    return body;
}

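Using this HTMLNode inside the method works fine. However, if I try to use the node anywhere else in my code, I get very strange results: most of the fields in the xmlNode struct point to random locations in memory.

Here is what the debugger shows for the HTMLNode inside the nodeFromString: method: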

body    HTMLNode *  0x7faaf96a3240  0x00007faaf96a3240
_node   xmlNode *   0x7faaf96b7ec0  0x00007faaf96b7ec0
    _private    void *  NULL    0x0000000000000000
    type    xmlElementType  XML_ELEMENT_NODE    XML_ELEMENT_NODE
    name    const xmlChar * "body"  0x00007faaf9693df0
    children    _xmlNode *  0x7faaf96b7fd0  0x00007faaf96b7fd0
        _private    void *  NULL    0x0000000000000000
        type    xmlElementType  XML_ELEMENT_NODE    XML_ELEMENT_NODE
        name    const xmlChar * "p" 0x00007faaf9678470
        children    _xmlNode *  0x7faaf96b80e0  0x00007faaf96b80e0
            _private    void *  NULL    0x0000000000000000
            type    xmlElementType  XML_TEXT_NODE   XML_TEXT_NODE
            name    const xmlChar * "text"  0x0000000100e31304
            children    _xmlNode *  NULL    0x0000000000000000
            content xmlChar *   "My content string" 0x00007faafa910200

And here is the debugger output for the same HTMLNode object after it has been returned from the method and used elsewhere:

body    HTMLNode *  0x7faaf96a3240  0x00007faaf96a3240
_node   xmlNode *   0x7faaf96b7ec0  0x00007faaf96b7ec0
    _private    void *  0x900007faaf96b7db  0x900007faaf96b7db
    type    xmlElementType  -1349076995 -1349076995
    name    const xmlChar * 0x7faaf969000a  0x00007faaf969000a
    children    _xmlNode *  0x7faaf96b7fd0  0x00007faaf96b7fd0
        _private    void *  0x600007faaf96b7ec  0x600007faaf96b7ec
        type    xmlElementType  -1349076978 -1349076978
        name    const xmlChar * ""  0x00007faaf967000a
        children    _xmlNode *  0x7faaf96b80e0  0x00007faaf96b80e0
            _private    void *  0x700007faaf96b7fd  0x700007faaf96b7fd
            type    xmlElementType  -1349076961 -1349076961
            name    const xmlChar * "XPathEvalExpression: %d object left on the stack\n"    0x0000000100e3000a
            children    _xmlNode *  NULL    0x0000000000000000
            content xmlChar *   "My content string" 0x00007faafa910200

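Why is the memory of the xmlNode ivar corrupted? What can I do to prevent it (I really don't want to parse the whole string inside a single method)?

A simple example project that reproduces the problem can be found here.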

Answer 1

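I think this is a bug in the parser: the XML hierarchy is freed together with the HTMLParser object. Once nodeFromString: returns, nothing references the parser any more, so it is presumably deallocated right away, and the xmlNode pointer held by the returned HTMLNode is left pointing at freed memory. That would explain why the addresses in your second dump are unchanged while the fields behind them contain garbage.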
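One possible workaround is to keep the parser alive for as long as its nodes are in use. The following is only a minimal sketch, assuming ARC and the HTMLParser API shown in the question (initWithString:error: and body). The ParsedFragment class and its property names are hypothetical and are not part of the library:

```objc
#import <Foundation/Foundation.h>
#import "HTMLParser.h"   // Ben Reeves' parser, as used in the question

// Hypothetical wrapper (not part of the library): it owns the parser so that
// the libxml2 tree behind `body` is not freed while the node is still in use.
@interface ParsedFragment : NSObject
@property (nonatomic, strong, readonly) HTMLParser *parser;
@property (nonatomic, strong, readonly) HTMLNode *body;
- (instancetype)initWithString:(NSString *)string;
@end

@implementation ParsedFragment

- (instancetype)initWithString:(NSString *)string
{
    self = [super init];
    if (self) {
        NSError *error = nil;
        _parser = [[HTMLParser alloc] initWithString:string error:&error];
        if (_parser == nil) {
            NSLog(@"Parsing failed: %@", error);
            return nil;
        }
        // The node stays valid only while _parser is alive, which is why
        // both are stored on the same object.
        _body = [_parser body];
    }
    return self;
}

@end
```

With something like this, nodeFromString: would return the ParsedFragment instead of the bare HTMLNode, and callers would read fragment.body for as long as they hold the fragment. Storing the parser in an ivar or property of the object that calls nodeFromString: should have the same effect.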

Strange memory behavior of a struct ivar in Objective-C
