Question

I am using libxml2 to parse HTML. The HTML might look like this:

<div>
    Some very very long text here.
</div>

I want to insert a child node, such as a heading, before the text, like this:

<div>
    <h3>
        Some header here
    </h3>
    Some very very long text here.
</div>

Unfortunately, libxml2 always adds my heading after the text, like this:

<div>
    Some very very long text here.
    <h3>
        Some header here
    </h3>
</div>

How can I fix this?

Answer 1

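The text content is itself a child node, so you can get a pointer to the text node and use the xmlAddPrevSibling function to add the new element before it. Here is an example, without error handling or proper cleanup.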

#include <string>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <libxml/xpath.h>

xmlInitParser();

// Create an XML document
std::string content( "<html><head/><body><div>Some long text here</div></body></html>" );
xmlDocPtr doc = xmlReadMemory( content.c_str(), content.size(), "noname.xml", 0, 0 );

// Query the XML document with XPath. We could use the XPath text() function
// to get the text node directly, but for the sake of the example we'll get the
// parent 'div' node and iterate over its child nodes instead.
std::string xpathExpr( "/html/body/div" );
xmlXPathContextPtr xpathCtx = xmlXPathNewContext( doc );
xmlXPathObjectPtr xpathObj = xmlXPathEvalExpression( BAD_CAST xpathExpr.c_str(), xpathCtx );

// Get the div node
xmlNodeSetPtr nodes = xpathObj->nodesetval;
xmlNodePtr divNode = nodes->nodeTab[ 0 ];

// Iterate the div child nodes, though in this example we know
// there'll only be one node, the text node.
xmlNodePtr divChildNode = divNode->xmlChildrenNode;
while( divChildNode != 0 )
    {
    if( xmlNodeIsText( divChildNode ) )
        {
        // Create a new 'h3' element with a text node as its child
        xmlNodePtr headingNode = xmlNewNode( 0, BAD_CAST "h3" );
        xmlNodePtr headingChildNode = xmlNewText( BAD_CAST "Some heading here" );
        xmlAddChild( headingNode, headingChildNode );

        // Add the new element to the existing tree before the text content
        xmlAddPrevSibling( divChildNode, headingNode );
        break;
        }
    divChildNode = divChildNode->next;
    }

// Display the result
xmlDocDump( stdout, doc );

xmlCleanupParser();
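
To see why appending lands the heading after the text, it may help to look at a toy model of a sibling list. The sketch below is not libxml2: Node, addChild, addPrevSibling and childOrder are hypothetical names that only mimic the general idea of a parent holding its children (text included) as a doubly linked list.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical toy node, NOT libxml2's xmlNode. Text is just another child.
struct Node {
    std::string name;          // element name, or "text" for a text node
    Node* parent = nullptr;
    Node* prev = nullptr;      // previous sibling
    Node* next = nullptr;      // next sibling
    Node* children = nullptr;  // first child
    Node* last = nullptr;      // last child

    explicit Node(std::string n) : name(std::move(n)) {}
};

// Append child as the LAST child of parent (similar in spirit to xmlAddChild).
void addChild(Node& parent, Node& child) {
    child.parent = &parent;
    child.prev = parent.last;
    child.next = nullptr;
    if (parent.last) parent.last->next = &child;
    else parent.children = &child;
    parent.last = &child;
}

// Insert newNode immediately BEFORE cur (similar in spirit to xmlAddPrevSibling).
void addPrevSibling(Node& cur, Node& newNode) {
    assert(&cur != &newNode);
    newNode.parent = cur.parent;
    newNode.next = &cur;
    newNode.prev = cur.prev;
    if (cur.prev) cur.prev->next = &newNode;
    else if (cur.parent) cur.parent->children = &newNode;  // new first child
    cur.prev = &newNode;
}

// Return the child names in document order, comma-separated.
std::string childOrder(const Node& parent) {
    std::string out;
    for (const Node* n = parent.children; n != nullptr; n = n->next) {
        if (!out.empty()) out += ",";
        out += n->name;
    }
    return out;
}
```

In this model, appending an h3 to a div that already holds a text node gives the order text,h3, while calling addPrevSibling on the text node gives h3,text, which is the behaviour the answer relies on.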
