使用libxml2解析XML文件的最快方法?

时间:2014-01-31 11:15:57

标签: c++ parsing libxml2

您是否有使用libxml2解析XML文件的“更快”方法? 现在我按照C ++代码那样做:

void parse_element_names(xmlNode * a_node, int *calls)
{
    xmlNode *cur_node = NULL;

    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
      (*calls)++;
      if(xmlStrEqual(xmlCharStrdup("to"),cur_node->name)){
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(xmlCharStrdup("from"),cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(xmlCharStrdup("note"),cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
        .
        .
        .
        //about 100 more node names comming
      else{
        parse_element_names(cur_node->children,calls);
      }
    }

}
int main(int argc, char **argv)
{ 

    xmlDoc *doc = NULL;
    xmlNode *root_element = NULL;

    if (argc != 2)
        return(1);

    /*parse the file and get the DOM */
    doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);

    if (doc == NULL) {
        printf("error: could not parse file %s\n", argv[1]);
    }
    int calls = 0;
    /*Get the root element node */
    root_element = xmlDocGetRootElement(doc);
    parse_element_names(root_element,&calls);

    /*free the document */
    xmlFreeDoc(doc);

    xmlCleanupParser();

    return 0;
}

这真的是最快的方式吗?或者你有什么更好/更快的解决方案可以给我建议吗?

谢谢

1 个答案:

答案 0 :(得分:2)

xmlReadFile等。基于libxml2的SAX parser interface(实际上是SAX2 interface),因此如果您不需要生成的xmlDoc,使用自己的SAX解析器通常会更快。

如果必须区分许多不同的元素名称,例如,最快的方法通常是为每种类型的节点创建单独的函数,并使用哈希表来查找这些函数。