Question

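I have a string containing XML code. I want to read it line by line so that I can extract the strings between the "title" tags. I know how to extract the title, but how do I iterate over the string?
It sounds simple, but I can't think of a way to do it right now. Thanks in advance.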

Answer 1

Could you provide more details about extracting the string between the "title" tags?

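If you can already extract the title tags, then you know where they are, so extracting the string is just a matter of taking the substring between the opening and closing title tags, correct?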
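For illustration, here is a minimal C++ sketch of that idea combined with the line-by-line loop the question asks about. It assumes the tags are literally `<title>` and `</title>` and that each title opens and closes on the same line; the helper names `extract_between` and `extract_titles` are made up for this example, and `std::optional` requires C++17.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Returns the text between the first `open` and the following `close` in `line`,
// or std::nullopt if either tag is missing. (Hypothetical helper.)
std::optional<std::string> extract_between(const std::string& line,
                                           const std::string& open,
                                           const std::string& close) {
    const std::size_t start = line.find(open);
    if (start == std::string::npos) return std::nullopt;
    const std::size_t content = start + open.size();
    const std::size_t end = line.find(close, content);
    if (end == std::string::npos) return std::nullopt;
    return line.substr(content, end - content);
}

// Walks the string one line at a time and collects the title found on each line.
// Assumes the tag is spelled <title>; adjust to the real tag name.
std::vector<std::string> extract_titles(const std::string& xml) {
    std::vector<std::string> titles;
    std::istringstream stream(xml);  // lets std::getline treat the string like a file
    std::string line;
    while (std::getline(stream, line)) {
        if (auto title = extract_between(line, "<title>", "</title>")) {
            titles.push_back(*title);
        }
    }
    return titles;
}
```

Wrapping the string in a `std::istringstream` is what makes `std::getline` usable here. A title that spans several lines, or a second title on the same line, would be missed by this sketch.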

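Are you looking for an XML parser? The open-source libxml works well and has bindings for a variety of languages. There are other parsers as well. What a parser lets you do is take an XML string and build a tree data structure from it, so that you can easily access the XML elements.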

Edit: the requirement not to use an XML parser was not in the original question. Here is a rough algorithm for writing your own XML parser.

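1) Create a tree data structure and a recursive parse() function.
2) Search for an XML tag, i.e. anything with the pattern <...>. Add the "..." tag as one of the children of the node you are currently at, then call the recursive parse() function again.
3) If you find an XML tag that closes the original <...>, you are done parsing that block. Go back to step 2. If there are no more blocks, return from the parse function.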

Here is some pseudocode:

// node: the current node in the tree
// current_position: the current position in the XML string being parsed
// string: the XML string being parsed
// Returns the position just past the last character consumed.
parse(node, current_position, string):
    while current_position < len(string):
        // find the next tag of any kind, searching from current_position
        tag_start = find(string, "<...>", current_position)
        if !found: return len(string) // nothing left to parse
        if tag is a closing tag "</...>":
            return tag_start + size_of_tag // closes this node; resume in the caller
        node.children[node.num_children] = new Node("<...>")
        current_position = parse(node.children[node.num_children], tag_start + size_of_tag, string)
        node.num_children++
    return current_position
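
The pseudocode above could be sketched in C++ roughly as follows. This is only an illustration under strong assumptions: tags have no attributes, there are no self-closing tags, comments, or declarations, and closing tags are trusted to match without being checked. The `Node` struct and its `text` field are hypothetical additions for this example, not part of any library.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: a tag name, the text directly inside it, and child elements.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Parses the contents of `node` starting at `pos`.
// Returns the position just past the closing tag that ends `node`,
// or xml.size() if the input runs out first.
std::size_t parse(Node& node, std::size_t pos, const std::string& xml) {
    while (pos < xml.size()) {
        const std::size_t open = xml.find('<', pos);
        if (open == std::string::npos) return xml.size();
        node.text += xml.substr(pos, open - pos);  // text before the next tag

        const std::size_t close = xml.find('>', open);
        if (close == std::string::npos) return xml.size();

        const std::string tag = xml.substr(open + 1, close - open - 1);
        pos = close + 1;

        if (!tag.empty() && tag[0] == '/') {
            return pos;  // closing tag: this node is finished
        }

        // Opening tag: attach a child and parse its contents recursively.
        auto child = std::make_unique<Node>();
        child->name = tag;
        Node& child_ref = *child;
        node.children.push_back(std::move(child));
        pos = parse(child_ref, pos, xml);
    }
    return pos;
}
```

Calling `parse(root, 0, xml)` on an empty `Node root` fills `root.children` with the top-level elements. After that, the title text can be read from the `text` field of whichever node has `name == "title"`.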

How to read a string line by line in C++
