Question

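I was wondering whether anyone has any thoughts on my problem. I need to extract all the image files from an HTML file loaded in a UIWebView. I have loaded the file into an NSString and now need to parse it. I built an array by using componentsSeparatedByString to search for .jpg, .gif, and so on, and then tried to work backwards to reach the start of each file name. My ideal solution would parse the HTML into an NSArray containing each img src="source" width="" height="" and so on.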

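Any help or tips would be appreciated. My last resort is to search/replace through the whole file from left to right to find the strings I need, but I hope there is a faster way.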

Answer 1

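Don't parse the HTML yourself; use libxml2. It has extensive HTML-oriented parsing and traversal facilities that let you navigate the document programmatically, element by element.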

I don't have HTML-oriented sample code on hand, but all you need is htmlReadDoc() to get a parsed document; then adapt your traversal from the libxml2 "read tree" example:

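#include <stdio.h>
#include <libxml/tree.h>

/* Print the name of every element node, depth first. */
void print_element_names(xmlNode *a_node)
{
    xmlNode *cur_node = NULL;

    /* Visit this node and each of its following siblings. */
    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
        if (cur_node->type == XML_ELEMENT_NODE) {
            printf("node type: Element, name: %s\n", (const char *)cur_node->name);
        }

        /* Recurse into the children of the current node. */
        print_element_names(cur_node->children);
    }
}

// ... call your version of this function with the root node of the document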
